HETEROLOGOUS EXPRESSION OF NEISSERIAL PROTEINS

All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the field of protein expression. In particular, it relates to the heterologous
expression of proteins from Neisseria (e.g. N. gonorrhoeae or, preferably, N. meningitidis).

BACKGROUND ART 

International patent applications WO99/24578, WO99/36544, WO99/57280 and
WO00/22430 disclose proteins from Neisseria meningitidis and Neisseria gonorrhoeae.
These proteins are typically described as being expressed in E.coli (i.e. heterologous
expression) as either N-terminal GST-fusions or C-terminal His-tag fusions, although other
expression systems, including expression in native Neisseria, are also disclosed.

It is an object of the present invention to provide alternative and improved approaches for
the heterologous expression of these proteins. These approaches will typically affect the
level of expression, the ease of purification, the cellular localisation of expression, and/or the
immunological properties of the expressed protein.

DISCLOSURE OF THE INVENTION

Nomenclature herein

The 2166 protein sequences disclosed in WO99/24578, WO99/36544 and WO99/57280 are
referred to herein by the following SEQ# numbers:

Application    Protein sequences         SEQ# herein
WO99/24578     Even SEQ IDs 2-892        SEQ#s 1-446
WO99/36544     Even SEQ IDs 2-90         SEQ#s 447-491
WO99/57280     Even SEQ IDs 2-3020       SEQ#s 492-2001
               Even SEQ IDs 3040-3114    SEQ#s 2002-2039
               SEQ IDs 3115-3241         SEQ#s 2040-2166

In addition to this SEQ# numbering, the naming conventions used in WO99/24578,
WO99/36544 and WO99/57280 are also used (e.g. 'ORF4', 'ORF40', 'ORF40-1' etc. as
used in WO99/24578 and WO99/36544; 'm919', 'g919' and 'a919' etc. as used in
WO99/57280).

The 2160 proteins NMB0001 to NMB2160 from Tettelin et al. [Science (2000) 287:1809-
1815] are referred to herein as SEQ#s 2167-4326 [see also WO00/66791].
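
The numbering scheme above can be expressed as a small lookup. The following is only an illustrative sketch: the function names `seq_number` and `nmb_seq_number` are hypothetical, and the ranges are transcribed from the nomenclature table and the NMB paragraph above.

```python
# Ranges transcribed from the nomenclature table:
# (application, first SEQ ID, last SEQ ID, step between SEQ IDs, first SEQ# herein)
SEQ_RANGES = [
    ("WO99/24578", 2, 892, 2, 1),
    ("WO99/36544", 2, 90, 2, 447),
    ("WO99/57280", 2, 3020, 2, 492),
    ("WO99/57280", 3040, 3114, 2, 2002),
    ("WO99/57280", 3115, 3241, 1, 2040),
]


def seq_number(application, seq_id):
    """Map a SEQ ID from one of the cited applications to the SEQ# used herein."""
    for app, first, last, step, first_seq in SEQ_RANGES:
        in_range = first <= seq_id <= last
        if app == application and in_range and (seq_id - first) % step == 0:
            return first_seq + (seq_id - first) // step
    raise ValueError(f"SEQ ID {seq_id} of {application} has no SEQ# herein")


def nmb_seq_number(nmb):
    """Map NMB0001..NMB2160 (given as an integer 1..2160) to SEQ#s 2167..4326."""
    if not 1 <= nmb <= 2160:
        raise ValueError("NMB number must be between 1 and 2160")
    return 2166 + nmb
```

For instance, `seq_number("WO99/57280", 3115)` gives the first SEQ# of the final row of the table, and `nmb_seq_number(1)` gives the SEQ# of NMB0001.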

The term 'protein of the invention' as used herein refers to a protein comprising: 

(a) one of sequences SEQ#s 1-4326; or 

(b) a sequence having sequence identity to one of SEQ#s 1-4326; or 

(c) a fragment of one of SEQ#s 1-4326. 

The degree of 'sequence identity' referred to in (b) is preferably greater than 50% (eg. 60%, 
70%, 80%, 90%, 95%, 99% or more). This includes mutants and allelic variants [e.g. see 
WO00/66741]. Identity is preferably determined by the Smith-Waterman homology search 
algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap 
search with parameters gap open penalty =12 and gap extension penalty =1. Typically, 50% 
identity or more between two proteins is considered to be an indication of functional 
equivalence. 
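
As a rough illustration of an affine-gap Smith-Waterman search with the stated penalties, the sketch below computes the best local alignment score of two sequences. It is not the MPSRCH implementation. The scoring matrix used by MPSRCH is not stated here, so the `match` and `mismatch` values are placeholder assumptions, as is the convention that a gap of length k costs gap_open + k x gap_ext. The function returns a score only; deriving percent identity would additionally require a traceback.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_ext=1):
    """Best local alignment score of a and b with affine gap penalties.

    Assumed convention: a gap of length k costs gap_open + k * gap_ext.
    match/mismatch are placeholder scores, not a real substitution matrix.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols        # best score ending at (i-1, j)
    f_prev = [neg_inf] * cols  # best score ending at (i-1, j) with a gap in b
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # best score ending at (i, j) with a gap in a
        for j in range(1, cols):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f_cur[j] = max(f_prev[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + pair, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

With these placeholder scores, a single-residue insertion is tolerated only when the flanking matches outweigh the gap cost of 13, which is the practical effect of a high gap-open penalty.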

The 'fragment' referred to in (c) should comprise at least n consecutive amino acids from 
one of SEQ#s 1-4326 and, depending on the particular sequence, n is 7 or more (eg. 8, 10, 
12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100 or more). Preferably the fragment 
comprises an epitope from one of SEQ#s 1-4326. Preferred fragments are those disclosed in 
WO00/71574 and WO01/04316. 
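
The fragment criterion in (c) amounts to a shared run of at least n consecutive residues. A minimal sketch of such a check is given below; the function name `shares_fragment` is hypothetical, and the reference string used in the usage note is only the first 30 residues of protein 919 as printed in Example 1.

```python
def shares_fragment(candidate, reference, n=7):
    """Return True if candidate contains at least n consecutive residues of reference."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # All windows of length n in the reference sequence.
    windows = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    # Any matching window in the candidate is sufficient.
    return any(candidate[i:i + n] in windows
               for i in range(len(candidate) - n + 1))
```

For example, with the reference `"MKKYLFRAALYGIAAAILAACQSKSIQTFP"`, a candidate containing `YGIAAAIL` satisfies the criterion for n = 7, whereas the six-residue peptide `YGIAAA` does not.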

Preferred proteins of the invention are found in N. meningitidis serogroup B. 

Preferred proteins for use according to the invention are those of serogroup B N. meningitidis
strain 2996 or strain 394/98 (a New Zealand strain). Unless otherwise stated, proteins
mentioned herein are from N. meningitidis strain 2996. It will be appreciated, however, that
the invention is not in general limited by strain. References to a particular protein (e.g. '287',
'919' etc.) may be taken to include that protein from any strain.

Non-fusion expression 

In a first approach to heterologous expression, no fusion partner is used, and the native 
leader peptide (if present) is used. This will typically prevent any 'interference' from fusion 
partners and may alter cellular localisation and/or post-translational modification and/or 
folding in the heterologous host. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) no fusion partner is used, and (b) the protein's native leader peptide
(if present) is used.

The method will typically involve the step of preparing a vector for expressing a protein of
the invention, such that the first expressed amino acid is the first amino acid (methionine) of
said protein, and the last expressed amino acid is the last amino acid of said protein (i.e. the
codon preceding the native STOP codon).

This approach is preferably used for the expression of the following proteins using the native
leader peptide: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503,
519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 936-1, 953,
961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2,
Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109 and NMB2050.
The suffix 'L' used herein in the name of a protein indicates expression in this manner using
the native leader peptide.

Proteins which are preferably expressed using this approach using no fusion partner and
which have no native leader peptide include: 008, 105, 117-1, 121-1, 122-1, 128-1, 148,
216, 243, 308, 593, 652, 726, 926, 982, Orf83-1 and Orf143-1.

Advantageously, this approach is used for the expression of ORF25 or ORF40, resulting in a
protein which induces better bactericidal antibodies than GST- or His-fusions.

This approach is particularly suited for expressing lipoproteins.

Leader-peptide substitution

In a second approach to heterologous expression, the native leader peptide of a protein of the
invention is replaced by that of a different protein. In addition, it is preferred that no fusion
partner is used. Whilst using a protein's own leader peptide in heterologous hosts can often
localise the protein to its 'natural' cellular location, in some cases the leader sequence is not
efficiently recognised by the heterologous host. In such cases, a leader peptide known to
drive protein targeting efficiently can be used instead.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's leader peptide is replaced by the leader peptide from a
different protein and, optionally, (b) no fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's
leader peptide and to introduce nucleotides that encode a different protein's leader peptide.
The resulting nucleic acid may be inserted into an expression vector, or may already be part
of an expression vector. The expressed protein will consist of the replacement leader peptide
at the N-terminus, followed by the protein of the invention minus its leader peptide.
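
At the protein level, the outcome of leader-peptide substitution can be sketched as a simple string operation. The example below is illustrative only: `replace_leader` is a hypothetical helper, the 919 leader is inferred from the sequences printed in Example 1 (the residues preceding the mature N-terminal cysteine), the ORF4 leader is the one quoted in Example 1, and the mature portion is truncated to its first ten residues.

```python
# Leader peptides as they appear in Example 1 (919 leader inferred from the
# printed full-length and leaderless sequences; ORF4 leader as quoted there).
LEADER_919 = "MKKYLFRAALYGIAAAILAA"
LEADER_ORF4 = "MKTFFKTLSAAALALILAA"

# Only the first ten residues of mature 919, used as a truncated stand-in.
MATURE_919_START = "CQSKSIQTFP"


def replace_leader(protein, native_leader, new_leader):
    """Swap the native leader peptide at the N-terminus for a different leader."""
    if not protein.startswith(native_leader):
        raise ValueError("protein does not begin with the stated native leader")
    return new_leader + protein[len(native_leader):]
```

Calling `replace_leader(LEADER_919 + MATURE_919_START, LEADER_919, LEADER_ORF4)` yields a string beginning with the ORF4 leader followed directly by the mature 919 residues, as in the 919LOrf4 construct of Example 1.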

The leader peptide is preferably from another protein of the invention (e.g. one of SEQ#s
1-4326), but may also be from an E.coli protein (e.g. the OmpA leader peptide) or an
Erwinia carotovora protein (e.g. the PelB leader peptide), for instance.

A particularly useful replacement leader peptide is that of ORF4. This leader is able to direct
lipidation in E.coli, improving cellular localisation, and is particularly useful for the
expression of proteins 287, 919 and ΔG287. The leader peptide and N-terminal domains of
961 are also particularly useful.

Another useful replacement leader peptide is that of E.coli OmpA. This leader is able to
direct membrane localisation in E.coli. It is particularly advantageous for the expression of
ORF1, resulting in a protein which induces better bactericidal antibodies than both
fusions and protein expressed from its own leader peptide.

Another useful replacement leader peptide is MKKYLFSAA. This can direct secretion into
culture medium, and is extremely short and active. The use of this leader peptide is not
restricted to the expression of Neisserial proteins - it may be used to direct the expression of
any protein (particularly bacterial proteins).

Leader-peptide deletion 

In a third approach to heterologous expression, the native leader peptide of a protein of the 
invention is deleted. In addition, it is preferred that no fusion partner is used. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's leader peptide is deleted and, optionally, (b) no fusion
partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's
leader peptide. The resulting nucleic acid may be inserted into an expression vector, or may
already be part of an expression vector. The first amino acid of the expressed protein will be
that of the mature native protein.
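
The effect of leader-peptide deletion on the expressed sequence can be sketched as follows. This is an illustration rather than a cloning protocol: `delete_leader` is a hypothetical helper, and the optional removal of the mature N-terminal cysteine reflects the '919 untagged' construct of Example 1, which omits both the leader and that cysteine.

```python
def delete_leader(protein, leader, drop_n_terminal_cys=False):
    """Return the protein without its leader peptide.

    If drop_n_terminal_cys is True, the first residue of the mature protein
    must be cysteine and is removed as well.
    """
    if not protein.startswith(leader):
        raise ValueError("protein does not begin with the stated leader")
    mature = protein[len(leader):]
    if drop_n_terminal_cys:
        if not mature.startswith("C"):
            raise ValueError("mature protein does not begin with cysteine")
        mature = mature[1:]
    return mature
```

Applied to the first 30 residues of 919 as printed in Example 1 (`"MKKYLFRAALYGIAAAILAACQSKSIQTFP"`, with the leader taken to be the 20 residues before the cysteine), the function returns `CQSKSIQTFP`, or `QSKSIQTFP` when the cysteine is also dropped.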

This method can increase the levels of expression. For protein 919, for example, expression
levels in E.coli are much higher when the leader peptide is deleted. Increased expression
may be due to altered localisation in the absence of the leader peptide.

The method is preferably used for the expression of 919, ORF46, 961, 050-1, 760 and 287.

Domain-based expression

In a fourth approach to heterologous expression, the protein is expressed as domains. This 
may be used in association with fusion systems (e.g. GST or His-tag fusions). 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) at least one domain in the protein is deleted and, optionally, (b) no
fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove at least one domain from within the
protein. The resulting nucleic acid may be inserted into an expression vector, or may already
be part of an expression vector. Where no fusion partners are used, the first amino acid of the
expressed protein will be that of a domain of the protein.

A protein is typically divided into notional domains by aligning it with known sequences in
databases and then determining regions of the protein which show different alignment
patterns from each other.

The method is preferably used for the expression of protein 287. This protein can be
notionally split into three domains, referred to as A, B & C (see Figure 5). Domain B aligns
strongly with IgA proteases, domain C aligns strongly with transferrin-binding proteins, and
domain A shows no strong alignment with database sequences. An alignment of
polymorphic forms of 287 is disclosed in WO00/66741.

Once a protein has been divided into domains, these can be (a) expressed singly, (b) deleted
from within the protein e.g. protein ABCD → ABD, ACD, BCD etc., or (c) rearranged e.g.
protein ABC → ACB, CAB etc. These three strategies can be combined with fusion partners
if desired.
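
The three domain strategies can be enumerated mechanically once a protein has been split into labelled domains. The sketch below treats each domain as a single-letter label (as in the ABCD notation above); the function names are hypothetical and no real sequences are involved.

```python
from itertools import combinations, permutations


def single_domains(domains):
    """Strategy (a): each domain expressed on its own."""
    return list(domains)


def deletion_variants(domains, n_deleted=1):
    """Strategy (b): delete n_deleted domains while keeping the original order."""
    keep = len(domains) - n_deleted
    if keep < 1 or n_deleted < 1:
        raise ValueError("at least one domain must be deleted and one kept")
    return ["".join(c) for c in combinations(domains, keep)]


def rearrangements(domains):
    """Strategy (c): every domain order other than the native one."""
    native = "".join(domains)
    orders = ("".join(p) for p in permutations(domains))
    return [order for order in orders if order != native]
```

For a four-domain protein, `deletion_variants("ABCD")` includes ABD, ACD and BCD (and also ABC, the C-terminal deletion), and `rearrangements("ABC")` includes ACB and CAB.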

ORF46 has also been notionally split into two domains - a first domain (amino acids 1-433)
which is well-conserved between species and serogroups, and a second domain (amino acids
433-608) which is not well-conserved. The second domain is preferably deleted. An
alignment of polymorphic forms of ORF46 is disclosed in WO00/66741.

Protein 564 has also been split into domains (Figure 8), as have protein 961 (Figure 12) and
protein 502 (amino acids 28-167 of the MC58 protein).

Hybrid proteins 

In a fifth approach to heterologous expression, two or more (e.g. 3, 4, 5, 6 or more) proteins
of the invention are expressed as a single hybrid protein. It is preferred that no
non-Neisserial fusion partner (e.g. GST or poly-His) is used.

This offers two advantages. Firstly, a protein that may be unstable or poorly expressed on its 
own can be assisted by adding a suitable hybrid partner that overcomes the problem. 
Secondly, commercial manufacture is simplified - only one expression and purification need 
be employed in order to produce two separately-useful proteins. 

Thus the invention provides a method for the simultaneous heterologous expression of two
or more proteins of the invention, in which said two or more proteins of the invention are
fused (i.e. they are translated as a single polypeptide chain).

The method will typically involve the steps of: obtaining a first nucleic acid encoding a first
protein of the invention; obtaining a second nucleic acid encoding a second protein of the
invention; ligating the first and second nucleic acids. The resulting nucleic acid may be
inserted into an expression vector, or may already be part of an expression vector.

Preferably, the constituent proteins in a hybrid protein according to the invention will be 
from the same strain. 

The fused proteins in the hybrid may be joined directly, or may be joined via a linker peptide
e.g. via a poly-glycine linker (i.e. Gn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more) or via a short
peptide sequence which facilitates cloning. It is evidently preferred not to join a ΔG protein
to the C-terminus of a poly-glycine linker.
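
For two proteins to be translated as a single polypeptide chain, the upstream coding sequence must lose its stop codon and the junction must stay in frame. The sketch below illustrates this with an optional poly-glycine linker. It is a toy example under stated assumptions: `fuse_orfs` is a hypothetical helper, GGC is chosen arbitrarily among the glycine codons, the downstream start codon is simply retained, and the sequences in the usage note are invented placeholders rather than Neisserial genes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLY_CODON = "GGC"  # one of the glycine codons; the choice is arbitrary here


def fuse_orfs(first_orf, second_orf, n_glycines=0):
    """Join two in-frame coding sequences, optionally via a (Gly)n linker.

    The stop codon of the upstream ORF (if present) is removed so that
    translation continues into the linker and the downstream ORF.
    """
    first_orf, second_orf = first_orf.upper(), second_orf.upper()
    for orf in (first_orf, second_orf):
        if len(orf) % 3 != 0:
            raise ValueError("ORF length must be a multiple of three")
    if first_orf[-3:] in STOP_CODONS:
        first_orf = first_orf[:-3]
    return first_orf + GLY_CODON * n_glycines + second_orf
```

For example, `fuse_orfs("ATGAAATAA", "ATGCCCTAA", n_glycines=3)` removes the first stop codon and inserts three glycine codons before the second coding sequence.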

The fused proteins may lack native leader peptides or may include the leader peptide 
sequence of the N-terminal fusion partner. 



WO 01/64922 



PCT/IB01/00452 




The method is well suited to the expression of proteins orf1, orf4, orf25, orf40, Orf46/46.1,
orf83, 233, 287, 292L, 564, 687, 741, 907, 919, 953, 961 and 983.

The 42 hybrids indicated by 'X' in the following table of form NH2-A-B-COOH are
preferred:





          ORF46.1  287      741      919      953      961      983
ORF46.1   -        X        X        X        X        X        X
287       X        -        X        X        X        X        X
741       X        X        -        X        X        X        X
919       X        X        X        -        X        X        X
953       X        X        X        X        -        X        X
961       X        X        X        X        X        -        X
983       X        X        X        X        X        X        -





Preferred proteins to be expressed as hybrids are thus ORF46.1, 287, 741, 919, 953, 961 and
983. These may be used in their essentially full-length form, or poly-glycine deletion (ΔG)
forms may be used (e.g. ΔG-287, ΔGTbp2, ΔG741, ΔG983 etc.), or truncated forms may be
used (e.g. Δ1-287, Δ2-287 etc.), or domain-deleted versions may be used (e.g. 287B, 287C,
287BC, ORF46₁₋₄₃₃, ORF46₄₃₃₋₆₀₈, ORF46, 961c etc.).
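
The table of preferred two-component hybrids corresponds to every ordered pairing of seven proteins with a different partner. A short sketch makes the count explicit; `preferred_hybrids` is a hypothetical name, and the output strings simply follow the NH2-A-B-COOH notation used for the table.

```python
from itertools import permutations

# The seven proteins heading the rows and columns of the hybrid table.
PREFERRED = ["ORF46.1", "287", "741", "919", "953", "961", "983"]


def preferred_hybrids(proteins=PREFERRED):
    """List every ordered pair A-B of distinct proteins as NH2-A-B-COOH."""
    return [f"NH2-{a}-{b}-COOH" for a, b in permutations(proteins, 2)]
```

Because order matters (919 at the N-terminus with 287 at the C-terminus is a different hybrid from the reverse), seven proteins give 7 x 6 = 42 hybrids.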

Particularly preferred are: (a) a hybrid protein comprising 919 and 287; (b) a hybrid protein
comprising 953 and 287; (c) a hybrid protein comprising 287 and ORF46.1; (d) a hybrid
protein comprising ORF1 and ORF46.1; (e) a hybrid protein comprising 919 and ORF46.1;
(f) a hybrid protein comprising ORF46.1 and 919; (g) a hybrid protein comprising ORF46.1,
287 and 919; (h) a hybrid protein comprising 919 and 519; and (i) a hybrid protein
comprising ORF97 and 225. Further embodiments are shown in Figure 14.

Where 287 is used, it is preferably at the C-terminal end of a hybrid; if it is to be used at the
N-terminus, it is preferred to use a ΔG form of 287 (e.g. as the N-terminus of a
hybrid with ORF46.1, 919, 953 or 961).

Where 287 is used, this is preferably from strain 2996 or from strain 394/98. 

Where 961 is used, this is preferably at the N-terminus. Domain forms of 961 may be used.

Alignments of polymorphic forms of ORF46, 287, 919 and 953 are disclosed in 
WO00/66741. Any of these polymorphs can be used according to the present invention. 

Temperature

In a sixth approach to heterologous expression, proteins of the invention are expressed at a 
low temperature. 

Expressed Neisserial proteins (e.g. 919) may be toxic to E.coli, which can be avoided by
expressing the toxic protein at a temperature at which its toxic activity is not manifested.

Thus the present invention provides a method for the heterologous expression of a protein of 
the invention, in which expression of a protein of the invention is carried out at a 
temperature at which a toxic activity of the protein is not manifested. 

A preferred temperature is around 30°C. This is particularly suited to the expression of 919.

Mutations

As discussed above, expressed Neisserial proteins may be toxic to E.coli. This toxicity can
be avoided by mutating the protein to reduce or eliminate the toxic activity. In particular,
mutations to reduce or eliminate toxic enzymatic activity can be used, preferably using
site-directed mutagenesis.

In a seventh approach to heterologous expression, therefore, an expressed protein is mutated
to reduce or eliminate toxic activity.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which the protein is mutated to reduce or eliminate toxic activity.

The method is preferably used for the expression of protein 907, 919 or 922. A preferred
mutation in 907 is at Glu-117 (e.g. Glu→Gly); preferred mutations in 919 are at Glu-255
(e.g. Glu→Gly) and/or Glu-323 (e.g. Glu→Gly); preferred mutations in 922 are at Glu-164
(e.g. Glu→Gly), Ser-213 (e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly).
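
A point substitution of this kind can be checked against the printed sequence before any mutagenesis is designed. The sketch below is illustrative: `substitute` is a hypothetical helper that works on a numbered fragment, and the fragments in the usage note are the ten-residue blocks of protein 919 printed at positions 251-260 and 321-330 in Example 1.

```python
def substitute(fragment, fragment_start, position, expected, replacement):
    """Replace the residue at a 1-based protein position within a fragment.

    fragment_start is the protein position of the fragment's first residue.
    The existing residue is verified before it is replaced.
    """
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError("position lies outside the fragment")
    if fragment[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {fragment[index]}"
        )
    return fragment[:index] + replacement + fragment[index + 1:]
```

With the 919 fragment `EDPVELFFMH` starting at position 251, `substitute("EDPVELFFMH", 251, 255, "E", "G")` performs the Glu-255 to Gly change; the same call on `LAEVLGQNPS` starting at 321 covers Glu-323.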

Alternative vectors 

In an eighth approach to heterologous expression, an alternative vector is used to express the
protein. This may be to improve expression yields, for instance, or to utilise plasmids that are
already approved for GMP use.

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which an alternative vector is used. The alternative vector is preferably 
pSM214, with no fusion partners. Leader peptides may or may not be included. 

This approach is particularly useful for protein 953. Expression and localisation of 953 with
its native leader peptide expressed from pSM214 is much better than from the pET vector.

pSM214 may also be used with: ΔG287, Δ2-287, Δ3-287, Δ4-287, Orf46.1, 961L, 961,
961(MC58), 961c, 961c-L, 919, 953 and ΔG287-Orf46.1.

Another suitable vector is pET-24b (Novagen; uses kanamycin resistance), again using no
fusion partners. pET-24b is preferred for use with: ΔG287K, Δ2-287K, Δ3-287K, Δ4-287K,
Orf46.1-K, Orf46A-K, 961-K (MC58), 961a-K, 961b-K, 961c-K, 961c-L-K, 961d-K,
ΔG287-919-K, ΔG287-Orf46.1-K and ΔG287-961-K.

Multimeric form 

In a ninth approach to heterologous expression, a protein is expressed or purified such that it
adopts a particular multimeric form.

This approach is particularly suited to protein 953. Purification of one particular multimeric
form of 953 (the monomeric form) gives a protein with greater bactericidal activity than
other forms (the dimeric form).

Proteins 287 and 919 may be purified in dimeric forms.

Protein 961 may be purified in a 180kDa oligomeric form (e.g. a tetramer). 

Lipidation 

In a tenth approach to heterologous expression, a protein is expressed as a lipidated protein. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which the protein is expressed as a lipidated protein.

This is particularly useful for the expression of 919, 287, ORF4, 406, 576-1, and ORF25. 
Polymorphic forms of 919, 287 and ORF4 are disclosed in WO00/66741. 

The method will typically involve the use of an appropriate leader peptide without using an 
N-terminal fusion partner. 

C-terminal deletions

In an eleventh approach to heterologous expression, the C-terminus of a protein of the 
invention is mutated. In addition, it is preferred that no fusion partner is used. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's C-terminus region is mutated and, optionally, (b) no
fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to mutate nucleotides that encode the protein's
C-terminus portion. The resulting nucleic acid may be inserted into an expression vector, or
may already be part of an expression vector. The first amino acid of the expressed protein
will be that of the mature native protein.

The mutation may be a substitution, insertion or, preferably, a deletion. 

This method can increase the levels of expression, particularly for proteins 730, ORF29 and
ORF46. For protein 730, a C-terminus region of around 65 to around 214 amino acids may
be deleted; for ORF46, the C-terminus region of around 175 amino acids may be deleted; for
ORF29, the C-terminus may be deleted to leave around 230-370 N-terminal amino acids.

Leader peptide mutation 

In a twelfth approach to heterologous expression, the leader peptide of the protein is
mutated. This is particularly useful for the expression of protein 919.

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which the protein's leader peptide is mutated. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; and manipulating said nucleic acid to mutate nucleotides within the leader
peptide. The resulting nucleic acid may be inserted into an expression vector, or may already
be part of an expression vector.

Poly-glycine deletion 

In a thirteenth approach to heterologous expression, poly-glycine stretches in wild-type
sequences are mutated. This enhances protein expression.

The poly-glycine stretch has the sequence (Gly)n, where n≥4 (e.g. 5, 6, 7, 8, 9 or more). This
stretch is mutated to disrupt or remove the (Gly)n. This may be by deletion (e.g. CGGGGS →
CGGGS, CGGS, CGS or CS), by substitution (e.g. CGGGGS → CGXGGS, CGXXGS,
CGXGXS etc.), and/or by insertion (e.g. CGGGGS → CGGXGGS, CGXGGGS, etc.).
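
Locating and shortening poly-glycine stretches is easily sketched with a regular expression. The helpers below are hypothetical illustrations covering only the deletion case; `min_len` defaults to 4 so that the CGGGGS example in the text is recognised as a poly-glycine stretch, which is an assumption about the intended threshold.

```python
import re


def find_polygly(seq, min_len=4):
    """Return (start index, stretch) for every run of at least min_len glycines."""
    pattern = "G{%d,}" % min_len
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq)]


def shorten_polygly(seq, keep=0, min_len=4):
    """Replace each poly-glycine run with `keep` glycines (keep < min_len)."""
    if not 0 <= keep < min_len:
        raise ValueError("keep must be non-negative and smaller than min_len")
    pattern = "G{%d,}" % min_len
    return re.sub(pattern, "G" * keep, seq)
```

For the motif in the text, `shorten_polygly("CGGGGS", keep=3)` through `keep=0` reproduce the deletion series CGGGS, CGGS, CGS and CS.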

This approach is not restricted to Neisserial proteins - it may be used for any protein
(particularly bacterial proteins) to enhance heterologous expression. For Neisserial proteins,
however, it is particularly suitable for expressing 287, 741, 983 and Tbp2. An alignment of
polymorphic forms of 287 is disclosed in WO00/66741.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which a poly-glycine stretch within the protein is mutated.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; and manipulating said nucleic acid to mutate nucleotides that encode a
poly-glycine stretch within the protein sequence. The resulting nucleic acid may be inserted into
an expression vector, or may already be part of an expression vector.

Conversely, the opposite approach (i.e. introduction of poly-glycine stretches) can be used to
suppress or diminish expression of a given heterologous protein.

Heterologous host 

Whilst expression of the proteins of the invention may take place in the native host (i.e. the
organism in which the protein is expressed in nature), the present invention utilises a
heterologous host. The heterologous host may be prokaryotic or eukaryotic. It is preferably
E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae, Salmonella typhi,
Salmonella typhimurium, Neisseria meningitidis, Neisseria gonorrhoeae, Neisseria
lactamica, Neisseria cinerea, Mycobacteria (e.g. M.tuberculosis), yeast etc.

Vectors etc.

As well as the methods described above, the invention provides (a) nucleic acid and vectors
useful in these methods; (b) host cells containing said vectors; (c) proteins expressed or
expressible by the methods; (d) compositions comprising these proteins, which may be
suitable as vaccines, for instance, or as diagnostic reagents, or as immunogenic compositions;
(e) these compositions for use as medicaments (e.g. as vaccines) or as diagnostic reagents; (f)
the use of these compositions in the manufacture of (1) a medicament for treating or
preventing infection due to Neisserial bacteria, (2) a diagnostic reagent for detecting the
presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria, and/or (3) a
reagent which can raise antibodies against Neisserial bacteria; and (g) a method of treating a
patient, comprising administering to the patient a therapeutically effective amount of these
compositions.

Sequences 

The invention also provides a protein or a nucleic acid having any of the sequences set out in
the following examples. It also provides proteins and nucleic acid having sequence identity
to these. As described above, the degree of 'sequence identity' is preferably greater than
50% (eg. 60%, 70%, 80%, 90%, 95%, 99% or more).

Furthermore, the invention provides nucleic acid which can hybridise to the nucleic acid
disclosed in the examples, preferably under "high stringency" conditions (eg. 65°C in a
0.1xSSC, 0.5% SDS solution).

The invention also provides nucleic acid encoding proteins according to the invention. 

It should also be appreciated that the invention provides nucleic acid comprising sequences 
complementary to those described above (eg. for antisense or probing purposes). 

Nucleic acid according to the invention can, of course, be prepared in many ways (eg. by
chemical synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can
take various forms (eg. single stranded, double stranded, vectors, probes etc.).

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such 
as those containing modified backbones, and also peptide nucleic acids (PNA) etc. 

BRIEF DESCRIPTION OF DRAWINGS 

Figures 1 and 2 show constructs used to express proteins using heterologous leader peptides.

Figure 3 shows expression data for ORF1, and Figure 4 shows similar data for protein 961. 
Figure 5 shows domains of protein 287, and Figures 6 & 7 show deletions within domain A. 
Figure 8 shows domains of protein 564. 

Figure 9 shows the PhoC reporter gene driven by the 919 leader peptide, and Figure 10
shows the results obtained using mutants of the leader peptide.

Figure 11 shows insertion mutants of protein 730 (A: 730-C1; B: 730-C2). 

Figure 12 shows domains of protein 961. 

Figure 13 shows SDS-PAGE of ΔG proteins. Dots show the main recombinant product.



Figure 14 shows 26 hybrid proteins according to the invention. 



MODES FOR CARRYING OUT THE INVENTION 



Example 1 - 919 and its leader peptide

Protein 919 from N. meningitidis (serogroup B, strain 2996) has the following sequence:

1 MKKYLFRAAL YGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA 

51 GTTVGGGGAV YTWPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 

101 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR 

151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT 

201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA

251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL

301 KLGQTSMQGI KAYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG 

351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG 

401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 

The leader peptide is underlined.



The sequences of 919 from other strains can be found in Figures 7 and 18 of WO00/66741.



Example 2 of WO99/57280 discloses the expression of protein 919 as a His-fusion in E.coli.
The protein is a good surface-exposed immunogen.



Three alternative expression strategies were used for 919:

1) 919 without its leader peptide (and without the mature N-terminal cysteine) and
without any fusion partner ('919untagged'):

1 QSKSIQTFP QPDTSVINGP DRPVGIPDPA GTTVGGGGAV YTWPHLSLP 

50 HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV CAQAFQTPVH SFQAKQFFER 

100 YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR RTAQARFPIY GIPDDFISVP 

150 LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT HTADLSRFPI TARTTAIKGR

200 FEGSRFLPYH TRNQINGGAL DGKAPILGYA EDPVELFFMH IQGSGRLKTP 

250 SGKYIRIGYA DKNEHPYVSI GRYMADKGYL KLGQTSMQGI KAYMRQNPQR 

300 LAEVLGQNPS YIFFRELAGS SNDGPVGALG TPLMGEYAGA VDRHYITLGA 

350 PLFVATAHPV TRKALNRLIM AQDTGSAIKG AVRVDYFWGY GDEAGELAGK 

400 QKTTGYVWQL LPNGMKPEYR P*



The leader peptide and cysteine were omitted by designing the 5'-end amplification
primer downstream from the predicted leader sequence.

2) 919 with its own leader peptide but without any fusion partner ('919L'); and

3) 919 with the leader peptide (mktffktlsaaalalilaa) from ORF4 ('919LOrf4').

1 MKTFFKTLS AAALALILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA 

50 GTTVGGGGAV YTWPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 

100 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR 

150 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT 

200 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA

250 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL 

300 KLGQTSMQGI KSYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG 

350 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG
400 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 

To make this construct, the entire sequence encoding the ORF4 leader peptide was
included in the 5'-primer as a tail (primer 919Lorf4 For). An NheI restriction site was
generated by a double nucleotide change in the sequence coding for the ORF4 leader
(no amino acid changes), to allow different genes to be fused to the ORF4 leader
peptide sequence. A stop codon was included in all the 3'-end primer sequences.

All three forms of the protein were expressed and could be purified. 

The '919L' and '919LOrf4' expression products were both lipidated, as shown by the
incorporation of [3H]-palmitate label. 919untagged did not incorporate the 3H label and was
located intracellularly.

919LOrf4 could be purified more easily than 919L. It was purified and used to immunise
mice. The resulting sera gave excellent results in FACS and ELISA tests, and also in the
bactericidal assay. The lipoprotein was shown to be localised in the outer membrane.

919untagged gave excellent ELISA titres and high serum bactericidal activity. FACS confirmed
its cell surface location.

Example 2 - 919 and expression temperature 

Growth of E.coli expressing the 919LOrf4 protein at 37°C resulted in lysis of the bacteria. In
order to overcome this problem, the recombinant bacteria were grown at 30°C. Lysis was
prevented without preventing expression.

Example 3 - mutation of 907, 919 and 922 

It was hypothesised that proteins 907, 919 and 922 are murein hydrolases, and more
particularly lytic transglycosylases. Murein hydrolases are located on the outer membrane
and participate in the degradation of peptidoglycan.

The purified proteins 919untagged, 919Lorf4, 919-His (i.e. with a C-terminus His-tag) and
922-His were thus tested for murein hydrolase activity [Ursinus & Holtje (1994) J.Bact.
176:338-343]. Two different assays were used, one determining the degradation of insoluble
murein sacculus into soluble muropeptides and the other measuring breakdown of
poly(MurNAc-GlcNAc)n>30 glycan strands.
The first assay uses murein sacculi radiolabelled with meso-2,6-diamino-3,4,5-[3H]pimelic
acid as substrate. Enzyme (3-10 µg total) was incubated for 45 minutes at 37°C in a total
volume of 100µl comprising 10mM Tris-maleate (pH 5.5), 10mM MgCl2, 0.2% v/v Triton
X-100 and [3H]A2pm labelled murein sacculi (about 10000cpm). The assay mixture was
placed on ice for 15 minutes with 100µl of 1% w/v N-acetyl-N,N,N-trimethylammonium and
precipitated material was pelleted by centrifugation at 10000g for 15 minutes.
The radioactivity in the supernatant was measured by liquid scintillation counting. E.coli
soluble lytic transglycosylase Slt70 was used as a positive control for the assay; the negative
control comprised the above assay solution without enzyme.
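
As an illustrative sketch only, the supernatant counts from the first assay could be converted into a percentage of solubilised label as shown below. The normalisation against total input counts and the no-enzyme control is an assumption rather than a calculation described here, and every cpm value is hypothetical.

```python
def percent_solubilised(sample_cpm, blank_cpm, total_cpm):
    """Return the percentage of labelled murein released into the supernatant.

    sample_cpm: supernatant counts for the enzyme-treated reaction
    blank_cpm:  supernatant counts for the no-enzyme negative control
    total_cpm:  counts added to each reaction (about 10000 cpm in the assay)
    """
    if total_cpm <= blank_cpm:
        raise ValueError("total counts must exceed the negative-control counts")
    return 100.0 * (sample_cpm - blank_cpm) / (total_cpm - blank_cpm)


# Hypothetical readings, used only to show the arithmetic.
TOTAL_CPM = 10000.0
BLANK_CPM = 400.0
READINGS = {"Slt70 (positive control)": 6400.0, "test enzyme": 3280.0}

for name, cpm in READINGS.items():
    value = percent_solubilised(cpm, BLANK_CPM, TOTAL_CPM)
    print(f"{name}: {value:.1f}% of label solubilised")
```

A reaction would be scored positive when its corrected value is clearly above the negative control; the threshold for "clearly" is left to the experimenter.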

All proteins except 919-His gave positive results in the first assay.

The second assay monitors the hydrolysis of poly(MurNAc-GlcNAc) glycan strands. Purified
strands, poly(MurNAc-GlcNAc)n>30 labelled with N-acetyl-D-1-[3H]glucosamine, were
incubated with 3µg of 919L in 10 mM Tris-maleate (pH 5.5), 10 mM MgCl2 and 0.2% v/v
Triton X-100 for 30 min at 37°C. The reaction was stopped by boiling for 5 minutes and the
pH of the sample adjusted to about 3.5 by addition of 10µl of 20% v/v phosphoric acid.
Substrate and product were separated by reversed phase HPLC on a Nucleosil 300 C18
column as described by Harz et al [Anal.Biochem. (1990) 190:120-128]. The E.coli lytic
transglycosylase MltA was used as a positive control in the assay. The negative control was
performed in the absence of enzyme.

By this assay, the ability of 919LOrf4 to hydrolyse isolated glycan strands was demonstrated
when anhydrodisaccharide subunits were separated from the oligosaccharide by HPLC.

Protein 919LOrf4 was chosen for kinetic analyses. The activity of 919LOrf4 was enhanced
3.7-fold by the addition of 0.2% v/v Triton X-100 in the assay buffer. The presence of Triton
X-100 had no effect on the activity of 919 untagged. The effect of pH on enzyme activity was
determined in Tris-maleate buffer over a range of 5.0 to 8.0. The optimal pH for the reaction
was determined to be 5.5. Over the temperature range 18°C to 42°C, maximum activity was
observed at 37°C. The effect of various ions on murein hydrolase activity was determined by
performing the reaction in the presence of a variety of ions at a final concentration of 10mM.
Maximum activity was found with Mg2+, which stimulated activity 2.1-fold. Mn2+ and Ca2+
also stimulated enzyme activity to a similar extent, while the addition of Ni2+ and EDTA had no
significant effect. In contrast, both Fe2+ and Zn2+ significantly inhibited enzyme activity.

The structures of the reaction products resulting from the digestion of unlabelled E.coli
murein sacculus were analysed by reversed-phase HPLC as described by Glauner [Anal.
Biochem. (1988) 172:451-464]. Murein sacculi digested with the muramidase Cellosyl were
used to calibrate and standardise the Hypersil ODS column. The major reaction products
were 1,6-anhydrodisaccharide tetra- and tripeptides, demonstrating the formation of a
1,6-anhydromuramic acid intramolecular bond.

These results demonstrate experimentally that 919 is a murein hydrolase and in particular a 
member of the lytic transglycosylase family of enzymes. Furthermore the ability of 922-His 
to hydrolyse murein sacculi suggests this protein is also a lytic transglycosylase. 

This activity may help to explain the toxic effects of 919 when expressed in E.coli.

In order to eliminate the enzymatic activity, rational mutagenesis was used. 907, 919 and 
922 show fairly low homology to three membrane-bound lipidated murein lytic 
transglycosylases from E.coli: 

919 (441aa) is 27.3% identical over 440aa overlap to E.coli MLTA (P46885);
922 (369aa) is 38.7% identical over 310aa overlap to E.coli MLTB (P41052); and
907-2 (207aa) is 26.8% identical over 149aa overlap to E.coli MLTC (P52066).

907-2 also shares homology with E.coli MLTD (P23931) and Slt70 (P03810), a soluble lytic
transglycosylase that is located in the periplasmic space. No significant sequence homology
can be detected among 919, 922 and 907-2, and the same is true among the corresponding
MLTA, MLTB and MLTC proteins.

Crystal structures are available for Slt70 [1QTEA; 1QTEB; Thunnissen et al (1995)
Biochemistry 34:12729-12737] and for Slt35 [1LTM; 1QUS; 1QUT; van Asselt et al (1999)
Structure Fold Des 7:1167-80] which is a soluble form of the 40kDa MLTB.

The catalytic residue (a glutamic acid) has been identified for both Slt70 and MLTB. 

In the case of Slt70, mutagenesis studies have demonstrated that even a conservative
substitution of the catalytic Glu505 with a glutamine (Gln) causes the complete loss of
enzymatic activity. Although Slt35 has no obvious sequence similarity to Slt70, their
catalytic domains show a surprising similarity. The corresponding catalytic residue in
MLTB is Glu162.

Another residue which is believed to play an important role in the correct folding of the
enzymatic cleft is a well-conserved glycine (Gly) downstream of the glutamic acid.
Recently, Terrak et al [Mol.Microbiol. (1999) 34:350-64] have suggested the presence of
another important residue, which is an aromatic amino acid located around 70-75 residues
downstream of the catalytic glutamic acid.

Sequence alignments of Slt70 with 907-2 and of MLTB with 922 were performed in order to
identify the corresponding catalytic residues in the MenB antigens.

The two alignments in the region of the catalytic domain are reported below: 

907-2/Slt70:

             90       100         110      ▼120       130       140
907-2.pep   ERRRLLVNIQYESSRAG--LDTQIVLGLIEVESAFRQYAISGVGARGLMQVMPFWKNYIG
            ||  |  |        |              |||      | ||| |||| ||
slty_ecoli  ERFPLAYNDLFKRYTSGKEIPQSYAMAIARQESAWNPKVKSPVGASGLMQIMPGTATHTV
                480       490       500    ▲  510       520       530
                                         Glu505

922/MLTB 

150 160 T 170 180 190 200 

922. pep VAQKYGVPAEIlIVAVIGIETNYGK^P^GSFRVADAIATLGFDYPRRAGFFQKELTOLLKIiA 

20 : I Mil hlhOhll "J: I = h I I I I I h I = I I I I I =1= II =1 l\ 

ml tb_ec Ol l AWQVYGVPPE I IVGI IGVETRWGRVMGKTRILDALATLSFNYPRRAEYFSGEIiETFLI^ 
150 160 A 170 180 190 200 

GLU162 

25 210 220 230 240 250 260 

92 2 . pep KEEGGDVFAFKGSYAGAMGMPQFMPSCTRKWAVDYTC 

::| I : Hlhllllh I I I I I I I ::: I I I :: I I I I ::| I h :||llhl 
mltb^ecoli RDEQDDPLNLKGSFAGAMGYGQFMPSSYKQYAVDFSGDGHINLWDPV^DAIGSVANYFKA 
210 220 230 240 250 260 

30 

From these alignments, the corresponding catalytic glutamate in 907-2 is
Glu117, whereas in 922 it is Glu164. Both antigens also share downstream glycines that could
have a structural role in the folding of the enzymatic cleft (in bold), and 922 has a conserved
aromatic residue around 70aa downstream (in bold).
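
The way a percentage identity and a conserved catalytic position are read off a pairwise alignment can be sketched as follows. The two strings are the 60-column fragments of the 907-2/Slt70 alignment with OCR spacing removed; the function name and the column index for the marked glutamate are illustrative assumptions. The fragment-level figure is not expected to equal the 26.8% reported over the full 149aa overlap.

```python
def identity_stats(aligned_a, aligned_b):
    """Count identical residues over the gap-free columns of two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    identical = sum(1 for a, b in columns if a == b)
    return identical, len(columns)


# Fragments taken from the 907-2/Slt70 alignment ("-" marks an alignment gap).
FRAG_907_2 = "ERRRLLVNIQYESSRAG--LDTQIVLGLIEVESAFRQYAISGVGARGLMQVMPFWKNYIG"
FRAG_SLT70 = "ERFPLAYNDLFKRYTSGKEIPQSYAMAIARQESAWNPKVKSPVGASGLMQIMPGTATHTV"

# 0-based column carrying the marked catalytic glutamate (assumed from the
# position of the marker in the alignment ruler).
CATALYTIC_COLUMN = 31

identical, overlap = identity_stats(FRAG_907_2, FRAG_SLT70)
print(f"{identical}/{overlap} identical ({100.0 * identical / overlap:.1f}%)")
print("catalytic column:", FRAG_907_2[CATALYTIC_COLUMN], FRAG_SLT70[CATALYTIC_COLUMN])
```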

In the case of protein 919, no 3D structure is available for its E.coli homologue MLTA, and
nothing is known about a possible catalytic residue. Nevertheless, three amino acids in 919
are predicted as catalytic residues by alignment with MLTA:

919/MLTA 

240 250 V 260 □ □ 270 □ 280 290 

40 919 .pep ALDGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRI -GYADKNEHPYVS IGRYMADK 

II' I Ihh:: hi HIM : :|: s : :|| II I I llh : 1 = 
mlta_ecoli .p ALSDKY-ILAYSNSLMDNFIMDVQGSGYIDFGDGSPLNFFSYAGKNGHAYRSIGK\^ 

170 180 190 200 210 
300 310 320 ▼ 330D CD 340 0350 0 

919 .pep GYLKLGQTSMQGIKSYMRQN^^ 

I :| : llhh :::::: hi ||||::||: : : II II ::||:| 

mlta_ecoli .p GEVKKEDMSMQAIRHWGETHSEAEVRELLEQNPSFVFFKPQSFA PVKGASAVPLVG 

5 220 230 240 250 260 270 

360 T o 380 390 400 00410 

919. pep EYAGAVDRHY I TLGAPLFVATAHPVTRKALN RLIMAQDTGSAIKGAVRVDYFWGY 

: : I II I Is |:: : :| ||::| hhllil : | : | 

10 mlta_ecoli.p RASVASDRSIIPPGTOLLAEVPLIxDNNGKFN^ 

280 290 300 310 320 330 

420 o 
9 1 9 . pep GDEAGEIiAGKQKTTGYVWQLLP 

15 I |||: || : | || | 

mlta_ecoli.p GPEAGHRAGWYNHYGRVWVLKT 
340 350 

The three possible catalytic residues are shown by the symbol ▼:

1) Glu255 (Asp in MLTA), followed by three conserved glycines (Gly263, Gly265 and
Gly272) and three conserved aromatic residues located approximately 75-77 residues
downstream. These downstream residues are shown by □.

2) Glu323 (conserved in MLTA), followed by 2 conserved glycines (Gly347 and Gly355)
and two conserved aromatic residues located 84-85 residues downstream (Tyr406 or
Phe407). These downstream residues are shown by 0.

3) Asp362 (instead of the expected Glu), followed by one glycine (Gly369) and a
conserved aromatic residue (Trp428). These downstream residues are shown by o.

Alignments of polymorphic forms of 919 are disclosed in WO00/66741. 

Based on the prediction of catalytic residues, three mutants of 919 and one mutant of
907, each containing a single amino acid substitution, have been generated. The glutamic
acids in positions 255 and 323 and the aspartic acid in position 362 of the 919 protein, and
the glutamic acid in position 117 of the 907 protein, were replaced with glycine residues
using PCR-based SDM. To do this, internal primers containing a codon change from Glu or
Asp to Gly were designed:



WO 01/64922 




PCT/IB01/00452 


Primers         Sequences                        Codon change

919-E255 for    CGAAGACCCCGTCGgtCTTTTTTTTATG     GAA -> Ggt
919-E255 rev    GTGCATAAAAAAAAGacCGACGGGGTCT

919-E323 for    AACGCCTCGCCGgtGTTTTGGGTCA        GAA -> Ggt
919-E323 rev    TTTGACCCAAAACacCGGCGAGGCG

919-D362 for    TGCCGGCGCAGTCGgtCGGCACTACA       GAC -> Ggt
919-D362 rev    TAATGTAGTGCCGacCGACTGCGCCG

907-E117 for    TGATTGAGGTGGgtAGCGCGTTCCG        GAA -> Ggt
907-E117 rev    GGCGGAACGCGCTacCCACCTCAAT

Underlined nucleotides code for glycine; the mutated nucleotides are in lower case.
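
As a hedged consistency check, each forward mutagenic primer should overlap the reverse complement of its partner across the mutated codon. The sketch below assumes the primer sequences as read from the table above (with the runs of T restored from the complementary reverse primers); the helper names are illustrative and are not part of any laboratory protocol described here.

```python
from difflib import SequenceMatcher

# Case-preserving complement table, so lower-case (mutated) bases stay visible.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_region(forward, reverse):
    """Longest stretch common to the forward primer and the reverse complement of its partner."""
    rc = reverse_complement(reverse)
    matcher = SequenceMatcher(None, forward, rc, autojunk=False)
    match = matcher.find_longest_match(0, len(forward), 0, len(rc))
    return forward[match.a:match.a + match.size]


PRIMER_PAIRS = {
    "919-E255": ("CGAAGACCCCGTCGgtCTTTTTTTTATG", "GTGCATAAAAAAAAGacCGACGGGGTCT"),
    "919-E323": ("AACGCCTCGCCGgtGTTTTGGGTCA", "TTTGACCCAAAACacCGGCGAGGCG"),
    "919-D362": ("TGCCGGCGCAGTCGgtCGGCACTACA", "TAATGTAGTGCCGacCGACTGCGCCG"),
    "907-E117": ("TGATTGAGGTGGgtAGCGCGTTCCG", "GGCGGAACGCGCTacCCACCTCAAT"),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    overlap = shared_region(fwd, rev)
    print(f"{name}: {len(overlap)} nt overlap, glycine codon present: {'Ggt' in overlap}")
```

An overlap of this kind is what allows the two first-round PCR products to anneal to each other in the second round.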



To generate the 919-E255, 919-E323 and 919-D362 mutants, PCR was performed using
20ng of the pET 919-LOrf4 DNA as template, and the following primer pairs:

1) Orf4L for / 919-E255 rev
2) 919-E255 for / 919L rev
3) Orf4L for / 919-E323 rev
4) 919-E323 for / 919L rev
5) Orf4L for / 919-D362 rev
6) 919-D362 for / 919L rev

The second round of PCR was performed using the product of PCR 1-2, 3-4 or 5-6 as
template, with "Orf4L for" and "919L rev" as forward and reverse primers, respectively.

For the mutant 907-E117, PCR was performed using 200ng of chromosomal DNA of
the 2996 strain as template and the following primer pairs:

7) 907L for / 907-E117 rev
8) 907-E117 for / 907L rev

The second round of PCR was performed using the products of PCR 7 and 8 as templates 
and the oligos "907L for" and "907L rev" as primers. 

The PCR fragments containing each mutation were processed following the standard
procedure, digested with NdeI and XhoI restriction enzymes and cloned into the pET-21b+
vector. The presence of each mutation was confirmed by sequence analysis.

Mutation of Glu117 to Gly in 907 is carried out similarly, as is mutation of residues Glu164,
Ser213 and Asn348 in 922.

The E255G mutant of 919 shows a 50% reduction in activity; the E323G mutant shows a
70% reduction in activity; the D362G mutant shows no reduction in activity.

Example 4 - multimeric form 

287-GST, 919 untagged and 953-His were subjected to gel filtration for analysis of quaternary
structure or preparative purposes. The molecular weight of the native proteins was estimated
using either FPLC Superose 12 (H/R 10/30) or Superdex 75 gel filtration columns
(Pharmacia). The buffers used for chromatography for 287, 919 and 953 were 50 mM Tris-HCl
(pH 8.0), 20 mM Bicine (pH 8.5) and 50 mM Bicine (pH 8.0), respectively.
Additionally each buffer contained 150-200 mM NaCl and 10% v/v glycerol. Proteins were
dialysed against the appropriate buffer and applied in a volume of 200µl. Gel filtration was
performed with a flow rate of 0.5 - 2.0 ml/min and the eluate monitored at 280nm. Fractions
were collected and analysed by SDS-PAGE. Blue dextran 2000 and the molecular weight
standards ribonuclease A, chymotrypsin A, ovalbumin and albumin (Pharmacia) were used to
calibrate the column. The molecular weight of the sample was estimated from a calibration
curve of Kav vs. log Mr of the standards. Before gel filtration, 287-GST was digested with
thrombin to cleave the GST moiety.
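
A calibration of this kind can be sketched as below, assuming the usual definition Kav = (Ve - V0)/(Vt - V0) and a straight-line fit of Kav against log10(Mr). The void volume, total volume and elution volumes are hypothetical, and the Mr values given for the standards are nominal figures assumed for illustration; none of these numbers come from the experiments described here.

```python
import math


def kav(ve, v0, vt):
    """Partition coefficient from elution (ve), void (v0) and total (vt) volumes."""
    return (ve - v0) / (vt - v0)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards, v0, vt):
    """Fit Kav against log10(Mr) for a list of (Mr, elution volume) standards."""
    xs = [math.log10(mr) for mr, _ in standards]
    ys = [kav(ve, v0, vt) for _, ve in standards]
    return fit_line(xs, ys)


def estimate_mr(ve, v0, vt, slope, intercept):
    """Invert the calibration line to estimate Mr from an elution volume."""
    return 10 ** ((kav(ve, v0, vt) - intercept) / slope)


# Hypothetical column parameters (ml); V0 would come from Blue dextran 2000.
V0, VT = 8.0, 24.0
# (nominal Mr, hypothetical elution volume in ml) for the four standards.
STANDARDS = [(13700, 13.96), (25000, 13.13), (43000, 12.37), (67000, 11.76)]

SLOPE, INTERCEPT = calibrate(STANDARDS, V0, VT)
sample_mr = estimate_mr(12.2, V0, VT, SLOPE, INTERCEPT)
print(f"estimated Mr for a sample eluting at 12.2 ml: {sample_mr:.0f}")
```

Dividing the estimated Mr by the monomer mass predicted from the sequence then suggests whether the protein runs as a monomer or a dimer.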

The estimated molecular weights for 287, 919 and 953-His were 73 kDa, 47 kDa and 43 kDa 
respectively. These results suggest 919 is monomeric while both 287 and 953 are principally 
dimeric in their nature. In the case of 953-His, two peaks were observed during gel filtration. 
The major peak (80%) represented a dimeric conformation of 953 while the minor peak
(20%) had the expected size of a monomer. The monomeric form of 953 was found to have 
greater bactericidal activity than the dimer. 

Example 5 - pSM214 and pET-24b vectors

953 protein with its native leader peptide and no fusion partners was expressed from the pET
vector and also from pSM214 [Velati Bellini et al (1991) J. Biotechnol. 18:177-192].

The 953 sequence was cloned as a full-length gene into pSM214 using the E.coli MM294-1
strain as a host. To do this, the entire DNA sequence of the 953 gene (from ATG to the
STOP codon) was amplified by PCR using the following primers:

953L for/2   CCGGAATTCTTATGAAAAAAATCATCTTCGCCGC   EcoRI
953L rev/2   GCCCAAGCTTTTATTGTTTGGCTGCCTCGATT     HindIII

which contain EcoRI and HindIII restriction sites, respectively. The amplified fragment was
digested with EcoRI and HindIII and ligated with the pSM214 vector digested with the same
two enzymes. The ligated plasmid was transformed into E.coli MM294-1 cells (by
incubation in ice for 65 minutes at 37° C) and bacterial cells plated on LB agar containing
20µg/ml of chloramphenicol.
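
A short sketch of how the two 953L primers can be checked for their cloning features is given below. It assumes the standard recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) and the primer sequences as listed above; the function names are illustrative only.

```python
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

FORWARD_953L = "CCGGAATTCTTATGAAAAAAATCATCTTCGCCGC"
REVERSE_953L = "GCCCAAGCTTTTATTGTTTGGCTGCCTCGATT"


def find_sites(primer):
    """Map each recognition site found in the primer to its 0-based start position."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


print("forward primer sites:", find_sites(FORWARD_953L))
print("reverse primer sites:", find_sites(REVERSE_953L))

# On the coding strand the reverse primer reads gene end -> stop codon -> HindIII site.
coding_strand = reverse_complement(REVERSE_953L)
gene_end = coding_strand[:coding_strand.find(SITES["HindIII"])]
print("coding strand ends with stop codon TAA:", gene_end.endswith("TAA"))

# The start codon lies downstream of the EcoRI site in the forward primer.
site_end = FORWARD_953L.find(SITES["EcoRI"]) + len(SITES["EcoRI"])
print("ATG downstream of EcoRI site:", "ATG" in FORWARD_953L[site_end:])
```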

Recombinant colonies were grown overnight at 37°C in 4 ml of LB broth containing
20µg/ml of chloramphenicol; bacterial cells were centrifuged and plasmid DNA was extracted
and analysed by restriction with EcoRI and HindIII. To analyse the ability of the
recombinant colonies to express the protein, they were inoculated in LB broth containing
20µg/ml of chloramphenicol and left to grow for 16 hours at 37°C. Bacterial cells were
centrifuged and resuspended in PBS. Expression of the protein was analysed by SDS-PAGE
and Coomassie Blue staining.

Expression levels were unexpectedly high from the pSM214 plasmid. 



Oligos used to clone sequences into pSM-214 vectors were as follows:

Construct                  Primer  Sequence                                 Site

ΔG287 (pSM-214)            Fwd     CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA   EcoRI
                           Rev     GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII
Δ2 287 (pSM-214)           Fwd     CCGGAATTCTTATG-AGCCAAGATATGGCGGCAGT      EcoRI
                           Rev     GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII
Δ3 287 (pSM-214)           Fwd     CCGGAATTCTTATG-TCCGCCGAATCCGCAAATCA      EcoRI
                           Rev     GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII
Δ4 287 (pSM-214)           Fwd     CCGGAATTCTTATG-GGAAGGGTTGATTTGGCTAATG    EcoRI
                           Rev     GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII
Orf46.1 (pSM-214)          Fwd     CCGGAATTCTTATG-TCAGATTTGGCAAACGATTCTT    EcoRI
                           Rev     GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC     HindIII
ΔG287-Orf46.1 (pSM-214)    Fwd     CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA   EcoRI
                           Rev     GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC     HindIII
919 (pSM-214)              Fwd     CCGGAATTCTTATG-CAAAGCAAGAGCATCCAAACCT    EcoRI
                           Rev     GCCCAAGCTT-TTACGGGCGGTATTCGGGCT          HindIII
961L (pSM-214)             Fwd     CCGGAATTCATATG-AAACACTTTCCATCC           EcoRI
                           Rev     GCCCAAGCTT-TTACCACTCGTAATTGAC            HindIII
961 (pSM-214)              Fwd     CCGGAATTCATATG-GCCACAAGCGACGAC           EcoRI
                           Rev     GCCCAAGCTT-TTACCACTCGTAATTGAC            HindIII
961c L (pSM-214)           Fwd     CCGGAATTCTTATG-AAACACTTTCCATCC           EcoRI
                           Rev     GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG         HindIII
961c (pSM-214)             Fwd     CCGGAATTCTTATG-GCCACAAACGACGACG          EcoRI
                           Rev     GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG         HindIII
953 (pSM-214)              Fwd     CCGGAATTCTTATG-GCCACCTACAAAGTGGACGA      EcoRI
                           Rev     GCCCAAGCTT-TTATTGTTTGGCTGCCTCGATT        HindIII

These sequences were manipulated, cloned and expressed as described for 953L.

For the pET-24 vector, sequences were cloned and the proteins expressed in pET-24 as
described below for pET21. pET-24 has the same sequence as pET-21, but with the kanamycin
resistance cassette instead of the ampicillin cassette.



Oligonucleotides used to clone sequences into pET-24b vector were:

Construct            Primer  Sequence                                  Site

ΔG 287 K             Fwd     CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC §       NheI
                     Rev     CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC *        XhoI
Δ2 287 K             Fwd     CGCGGATCCGCTAGC-CAAGATATGGCGGCAGT §       NheI
Δ3 287 K             Fwd     CGCGGATCCGCTAGC-GCCGAATCCGCAAATCA §       NheI
Δ4 287 K             Fwd     CGCGCTAGC-GGAAGGGTTGATTTGGCTAATGG §       NheI
Orf46.1 K            Fwd     GGGAATTCCATATG-GGCATTTCCCGCAAAATATC       NdeI
                     Rev     CCCGCTCGAG-TTACGTATCATATTTCACGTGC         XhoI
Orf46A K             Fwd     GGGAATTCCATATG-GGCATTTCCCGCAAAATATC       NdeI
                     Rev     CCCGCTCGAG-TTATTCTATGCCTTGTGCGGCAT        XhoI
961 K (MC58)         Fwd     CGCGGATCCCATATG-GCCACAAGCGACGACGA         NdeI
                     Rev     CCCGCTCGAG-TTACCACTCGTAATTGAC             XhoI
961a K               Fwd     CGCGGATCCCATATG-GCCACAAACGACG             NdeI
                     Rev     CCCGCTCGAG-TCATTTAGCAATATTATCTTTGTTC      XhoI
961b K               Fwd     CGCGGATCCCATATG-AAAGCAAACAGTGCCGAC        NdeI
                     Rev     CCCGCTCGAG-TTACCACTCGTAATTGAC             XhoI
961c K               Fwd     CGCGGATCCCATATG-GCCACAAACGACG             NdeI
                     Rev     CCCGCTCGAG-TTAACCCACGTTGTAAGGT            XhoI
961cL K              Fwd     CGCGGATCCCATATG-ATGAAACACTTTCCATCC        NdeI
                     Rev     CCCGCTCGAG-TTAACCCACGTTGTAAGGT            XhoI
961d K               Fwd     CGCGGATCCCATATG-GCCACAAACGACG             NdeI
                     Rev     CCCGCTCGAG-TCAGTCTGACACTGTTTTATCC         XhoI
ΔG 287-919 K         Fwd     CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC         NheI
                     Rev     CCCGCTCGAG-TTACGGGCGGTATTCGG              XhoI
ΔG 287-Orf46.1 K     Fwd     CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC         NheI
                     Rev     CCCGCTCGAG-TTACGTATCATATTTCACGTGC         XhoI
ΔG 287-961 K         Fwd     CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC         NheI
                     Rev     CCCGCTCGAG-TTACCACTCGTAATTGAC             XhoI

* This primer was used as a Reverse primer for all the 287 forms.
§ Forward primers used in combination with the ΔG287 K reverse primer.

Example 6 - ORF1 and its leader peptide 

ORF1 from N. meningitidis (serogroup B, strain MC58) is predicted to be an outer membrane 
or secreted protein. It has the following sequence: 

1 MKTTDKRTTE THRKAPKTGR IRTSPAYLAI CLSFGILPQA WAGHTYFGIN
51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG
101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT
151 KGHPYGGDYH MPRLHKFVTD AEPVEWTSYM DGRKYIDQNN YPDRVRIGAG
201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI
251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF
301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS
351 LPNRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE
401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK
451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA
501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT
551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD
601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA
651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVSRNVAK
701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS
751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL
801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS
851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL
901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT
951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN
1001 NTGNEPASLE QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG
1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES
1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR
1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD
1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV
1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG
1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY
1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR
1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG
1451 IKLGYRW*


The leader peptide is underlined.

A polymorphic form of ORF1 is disclosed in W099/55873. 



Three expression strategies have been used for ORF1:

1) ORF1 using a His tag, following WO99/24578 (ORF1-His);

2) ORF1 with its own leader peptide but without any fusion partner ('ORF1L'); and

3) ORF1 with the leader peptide (mkktaiaiavalagfatvaqa) from E.coli OmpA
('Orf1LOmpA'):

MKKTAIAIAVAIAGFATVAQAASAGHTYFGI 

VSRNGVAALVGIXJYIVSVAHNGGYNNVDFGAEGRNPIXJHRF^^ 

PVEMTSYMX5RKYIDQNOTPDRTOIGA 
40 IKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVIOTGNPYIGKSNGFQIiVRKDWFYDE 

NGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRWQLFWSLSETAREPVYHAAG SF IDEGKG 

ELILTSNINQGAGGLYFQGDFWSPENNETWQGAGVHISEDSTVTWKVNGVAND^ 

VGDGWILDQQADDKGRKQAFSEIGLVSGRGTVQLNADNQFNPDK^^ 

NHNQDKESTVTITGNKDIATTGNNNSLDSKKEIA 
45 QTNGKLFF SGRPT PHAYNHLNDHWSQKEG I PRGEIVWLNDWINRTFKAENFQIKGGQAWSRWAKVKGDWHLSNHA 

QAWGVAPHQSHT ICTRSDWTGLTNCVEKT ITDDKVIASLTKTDI SGNVDLADHAHLNLTGLATLNGNL SANGDTRY 

WSHNATQNGNLSLVGNAQATFNQATLNGOTSASGNASF^ 

FHFESSRFTGQISGGKOTALHLKDSEWTLPSGTE 

LLSVTPPTSVESRFNTLTVNGKLNGQGTFRFMSELFGYRSDKLK^ 
50 NKPLSENLNFTLQNEHVDAGAWRYQLIRKIX5EFR 

KTESVAEPARQAGGEWGIMQAEEEKKRVQADKDTALAKQREAETRPATTAFP 

I SRYANSGL SEF SATLNSVFAVQDELDRVFAEDRRNAVWT SGIRDTKHYRSQDFRAYRQQTDLRQ IGMQKNLGSGRV 
GILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGI SAGAGFS SGSLSDGIGGKIRRRVIjHYGIQARYRAGF 
GGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHI S ITPYLSLSYTDAASGKVRTRVN 
55 T AVLAQDFGKTRS AEWGVNAE I KGFTL SLHAAAAKG PQLEAQHSAG IKLGYRW * 

To make this construct, the clone pET911LOmpA (see below) was digested with the
NheI and XhoI restriction enzymes and the fragment corresponding to the vector
carrying the OmpA leader sequence was purified (pETLOmpA). The ORF1 gene
coding for the mature protein was amplified using the oligonucleotides ORF1-For
and ORF1-Rev (including the NheI and XhoI restriction sites, respectively), digested
with NheI and XhoI and ligated to the purified pETLOmpA fragment (see Figure 1).
An additional AS dipeptide was introduced by the NheI site.
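
The leader-peptide swap can be sketched at the protein level as follows. The mature-protein start index is an assumption inferred from the fusion sequence shown above (OmpA leader, then AS, then AGHTYFGI...); only the first 50 residues of ORF1 are used, and the function name is illustrative.

```python
OMPA_LEADER = "MKKTAIAIAVALAGFATVAQA"   # E.coli OmpA leader peptide
NHEI_LINKER = "AS"                      # dipeptide encoded by the NheI site (GCTAGC)

# First 50 residues of the ORF1 precursor (strain MC58).
ORF1_N_TERMINUS = "MKTTDKRTTETHRKAPKTGRIRTSPAYLAICLSFGILPQAWAGHTYFGIN"

# 0-based index of the first residue retained in the fusion (assumed).
MATURE_START = 41


def swap_leader(precursor, mature_start, new_leader, linker=""):
    """Replace the native leader of a precursor with a new leader plus optional linker."""
    return new_leader + linker + precursor[mature_start:]


fusion_start = swap_leader(ORF1_N_TERMINUS, MATURE_START, OMPA_LEADER, NHEI_LINKER)
print(fusion_start)
```

The same helper would apply to the other leader swaps described here, with a different leader, linker and start index in each case.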

All three forms of the protein were expressed. The His-tagged protein could be purified and
was confirmed as surface exposed, and possibly secreted (see Figure 3). The protein was
used to immunise mice, and the resulting sera gave excellent results in the bactericidal assay.

ORF1LOmpA was purified as total membranes, and was localised in both the inner and
outer membranes. Unexpectedly, sera raised against ORF1LOmpA show even better ELISA
and bactericidal properties than those raised against the His-tagged protein.

ORF1L was purified as outer membranes, where it is localised. 

Example 7 - protein 911 and its leader peptide

Protein 911 from N.meningitidis (serogroup B, strain MC58) has the following sequence:

1 MKKNILEFWV GLFVLIGAAA VAFLAFRVAG GAAFGGSDKT YAVYADFGDI 

51 GGLKVNAPVK SAGVLVGRVG AIGLDPKSYQ ARVRLDLDGK YQPSSDVSAQ 

101 ILTSGLLGEQ YIGLQQGGDT ENLAAGDTIS VTSSAMVLEN LIGKFMTSFA 

151 EKNADGGNAE KAAE*

The leader peptide is underlined. 

Three expression strategies have been used for 911:

1) 911 with its own leader peptide but without any fusion partner ('911L');

2) 911 with the leader peptide from E.coli OmpA ('911LOmpA').

To make this construct, the entire sequence encoding the OmpA leader peptide was
included in the 5'-primer as a tail (primer 911LOmpA Forward). An NheI restriction
site was inserted between the sequence coding for the OmpA leader peptide and the
911 gene encoding the predicted mature protein (insertion of one amino acid, a
serine), to allow the use of this construct to clone different genes downstream of the
OmpA leader peptide sequence.

3) 911 with the leader peptide (mkyllptaaaglllaaqpama) from Erwinia carotovora
PelB ('911LpelB').

To make this construct, the 5'-end PCR primer was designed downstream from the
leader sequence and included the NcoI restriction site in order to have the 911 fused
directly to the PelB leader sequence; the 3'-end primer included the STOP codon.
The expression vector used was pET22b+ (Novagen), which carries the coding
sequence for the PelB leader peptide. The NcoI site introduces an additional
methionine after the PelB sequence.

All three forms of the protein were expressed. ELISA titres were highest using 911L, with
911LOmpA also giving good results.

Example 8 - ORF46

The complete ORF46 protein from N. meningitidis (serogroup B, strain 2996) has the
following sequence:

1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY IVRFSDHGHE
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGSMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK RSQMGAIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP
451 VSDAKPRWEV DRKLNKLTTR EQVEKNVQEI RNGNKNSNFS QHAQLEREIN
501 KLKSADEINF ADGMGKFTDS MNDKAFSRLV KSVKENGFTN PVVEYVEING
551 KAYIVRGNNR VFAAEYLGRI HELKFKKVDF PVPNTSWKNP TDVLNESGNV
601 KRPRYRSK*

The leader peptide is underlined. 

The sequences of ORF46 from other strains can be found in WO00/66741.

Three expression strategies have been used for ORF46:

1) ORF46 with its own leader peptide but without any fusion partner ('ORF46-2L');
2) ORF46 without its leader peptide and without any fusion partner ('ORF46-2'), with
the leader peptide omitted by designing the 5'-end amplification primer downstream
from the predicted leader sequence:

1 SDLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERSGHI GLGKIQSHQL
51 GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS DEAGSPVDGF
101 SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI KGVAQNIRLN
151 DTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL DRSGNAAEAF
201 NGTADIVKNI IGAAGEIVGA GDAVQGISEG SNIAVMHGLG LLSTENKMAR
251 INDLADMAQL KDYAAAAIRD WAVQNPNAAQ GIEAVSNIFM AAIPIKGIGA
301 VRGKYGLGGI TAHPIKRSQM GAIALPKGKS AVSDNFADAA YAKYPSPYHS
351 RNIRSNLEQR YGKENITSST VPPSNGKNVK LADQRHPKTG VPFDGKGFPN
401 FEKHVKYDTK LDIQELSGGG IPKAKPVSDA KPRWEVDRKL NKLTTREQVE
451 KNVQEIRNGN KNSNFSQHAQ LEREINKLKS ADEINFADGM GKFTDSMNDK
501 AFSRLVKSVK ENGFTNPVVE YVEINGKAYI VRGNNRVFAA EYLGRIHELK
551 FKKVDFPVPN TSWKNPTDVL NESGNVKRPR YRSK*






3) ORF46 as a truncated protein, consisting of the first 433 amino acids ('ORF46.1L'), 
constructed by designing PCR primers to amplify a partial sequence corresponding 
to aa 1-433. 

A STOP codon was included in the 3'-end primer sequences.

ORF46-2L is expressed at a very low level in E.coli. Removal of its leader peptide
(ORF46-2) does not solve this problem. The truncated ORF46.1L form (first 433 amino
acids, which are well conserved between serogroups and species), however, is
well-expressed and gives excellent results in ELISA tests and in the bactericidal assay.

ORF46.1 has also been used as the basis of hybrid proteins. It has been fused with 287, 919,
and ORF1. The hybrid proteins were generally insoluble, but gave some good ELISA and
bactericidal results (against the homologous 2996 strain):



Protein               ELISA    Bactericidal Ab

Orf1-Orf46.1-His      850      256
919-Orf46.1-His       12900    512
919-287-Orf46-His     n.d.     n.d.
Orf46.1-287His        150      8192
Orf46.1-919His        2800     2048
Orf46.1-287-919His    3200     16384



For comparison, 'triple' hybrids of ORF46.1, 287 (either as a GST fusion, or in ΔG287
form) and 919 were constructed and tested against various strains (including the homologous
2996 strain) versus a simple mixture of the three antigens. FCA was used as adjuvant:





                        2996    BZ232   MC58    NGH38   F6124   BZ133

Mixture                 8192    256     512     1024    >2048   >2048
ORF46.1-287-919his      16384   256     4096    8192    8192    8192
ΔG287-919-ORF46.1his    8192    64      4096    8192    8192    16384
ΔG287-ORF46.1-919his    4096    128     256     8192    512     1024



Again, the hybrids show equivalent or superior immunological activity. 
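
Because bactericidal titres are reported as two-fold dilution steps, one simple way to compare a hybrid with the antigen mixture is the log2 ratio of their titres per strain. The sketch below uses the 'Mixture' and 'ORF46.1-287-919his' rows of the table above; treating a '>' value as its lower bound is an assumption, so ratios involving such values are flagged as bounds rather than exact figures.

```python
import math

STRAINS = ["2996", "BZ232", "MC58", "NGH38", "F6124", "BZ133"]
MIXTURE = ["8192", "256", "512", "1024", ">2048", ">2048"]
HYBRID = ["16384", "256", "4096", "8192", "8192", "8192"]  # ORF46.1-287-919his


def parse_titre(text):
    """Return (value, censored); '>2048' becomes (2048, True)."""
    censored = text.startswith(">")
    return int(text.lstrip(">")), censored


def log2_ratio(hybrid, reference):
    """log2(hybrid / reference) plus a flag marking ratios that are only bounds."""
    h_value, h_censored = parse_titre(hybrid)
    r_value, r_censored = parse_titre(reference)
    return math.log2(h_value / r_value), (h_censored or r_censored)


COMPARISON = [log2_ratio(h, m) for h, m in zip(HYBRID, MIXTURE)]

for strain, (steps, is_bound) in zip(STRAINS, COMPARISON):
    note = " (bound only, reference titre was '>')" if is_bound else ""
    print(f"{strain}: {steps:+.0f} two-fold dilution steps vs mixture{note}")
```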



Hybrids of two proteins (strain 2996) were compared to the individual proteins against 
various heterologous strains: 

                    1000    MC58    F6124 (MenA)

ORF46.1-His         <4      4096    <4
ORF1-His            8       256     128
ORF1-ORF46.1-His    1024    512     1024



Again, the hybrid shows equivalent or superior immunological activity. 



Example 9 - protein 961 

The complete 961 protein from N. meningitidis (serogroup B, strain MC58) has the following 
sequence: 

1 MSMKHFPAKV LTTAILATFC SGALAATSDD DVKKAATVAI VAAYNNGQEI
51 NGFKAGETIY DIGEDGTITQ KDATAADVEA DDFKGLGLKK VVTNLTKTVN
101 ENKQNVDAKV KAAESEIEKL TTKLADTDAA LADTDAALDE TTNALNKLGE
151 NITTFAEETK TNIVKIDEKL EAVADTVDKH AEAFNDIADS LDETNTKADE
201 AVKTANEAKQ TAEETKQNVD AKVKAAETAA GKAEAAAGTA NTAADKAEAV
251 AAKVTDIKAD IATNKADIAK NSARIDSLDK NVANLRKETR QGLAEQAALS
301 GLFQPYNVGR FNVTAAVGGY KSESAVAIGT GFRFTENFAA KAGVAVGTSS
351 GSSAAYHVGV NYEW*

The leader peptide is underlined. 

Three approaches to 961 expression were used:

1) 961 using a GST fusion, following WO99/57280 ('GST961');

2) 961 with its own leader peptide but without any fusion partner ('961L'); and

3) 961 without its leader peptide and without any fusion partner ('961 untagged'), with the
leader peptide omitted by designing the 5'-end PCR primer downstream from the
predicted leader sequence.

All three forms of the protein were expressed. The GST-fusion protein could be purified and
antibodies against it confirmed that 961 is surface exposed (Figure 4). The protein was used
to immunise mice, and the resulting sera gave excellent results in the bactericidal assay.
961L could also be purified and gave very high ELISA titres.

Protein 961 appears to be phase variable. Furthermore, it is not found in all strains of
N.meningitidis.

Example 10 - protein 287 

Protein 287 from N. meningitidis (serogroup B, strain 2996) has the following sequence: 

1 MFBRSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP VVAEKETEVK
51 EDAPQAGSQG QGAPSTQGSQ DMAAVSAENT GNGGAATTDK PKNEDEGPQN
101 DMPQNSAESA NQTGNNQPAD SSDSAPASNP APANGGSNFG RVDLANGVLI
151 DGPSQNITLT HCKGDSCNGD NLLDEEAPSK SEFENLNESE RIEKYKKDGK
201 SDKFTNLVAT AVQANGTNKY VIIYKDKSAS SSSARFRRSA RSRRSLPAEM
251 PLIPVNQADT LIVDGEAVSL TGHSGNIFAP EGNYRYLTYG AEKLPGGSYA
301 LRVQGEPAKG EMLAGTAVYN GEVLHFHTEN GRPYPTRGRF AAKVDFGSKS
351 VDGIIDSGDD LHMGTQKFKA AIDGNGFKGT WTENGGGDVS GRFYGPAGEE
401 VAGKYSYRPT DAEKGGFGVF AGKKEQD*

The leader peptide is shown underlined. 

The sequences of 287 from other strains can be found in Figures 5 and 15 of WO00/66741. 
Example 9 of WO99/57280 discloses the expression of 287 as a GST-fusion in E.coli.

A number of further approaches to expressing 287 in E.coli have been used, including:

1) 287 as a His-tagged fusion ('287-His');

2) 287 with its own leader peptide but without any fusion partner ('287L');

3) 287 with the ORF4 leader peptide and without any fusion partner ('287LOrf4'); and

4) 287 without its leader peptide and without any fusion partner ('287 untagged'):

1 CGGGGGGSPD VKSADTLSKP AAPVVAEKET EVKEDAPQAG SQGQGAPSTQ
51 GSQDMAAVSA ENTGNGGAAT TDKPKNEDEG PQNDMPQNSA ESANQTGNNQ
101 PADSSDSAPA SNPAPANGGS NFGRVDLANG VLIDGPSQNI TLTHCKGDSC
151 NGDNLLDEEA PSKSEFENLN ESERIEKYKK DGKSDKFTNL VATAVQANGT
201 NKYVIIYKDK SASSSSARFR RSARSRRSLP AEMPLIPVNQ ADTLIVDGEA
251 VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP AKGEMLAGTA
301 VYNGEVLHFH TENGRPYPTR GRFAAKVDFG SKSVDGIIDS GDDLHMGTQK
351 FKAAIDGNGF KGTWTENGGG DVSGRFYGPA GEEVAGKYSY RPTDAEKGGF
401 GVFAGKKEQD *

All these proteins could be expressed and purified.

'287L' and '287LOrf4' were confirmed as lipoproteins. 

As shown in Figure 2, '287LOrf4' was constructed by digesting 919LOrf4 with NheI and
XhoI. The entire ORF4 leader peptide was restored by the addition of a DNA sequence
coding for the missing amino acids, as a tail, in the 5'-end primer (287LOrf4 for), fused to
the 287 coding sequence. The 287 gene coding for the mature protein was amplified using the
oligonucleotides 287LOrf4 For and Rev (including the NheI and XhoI sites, respectively),
digested with NheI and XhoI and ligated to the purified pETOrf4 fragment.

Example 11 - further non-fusion proteins with/without native leader peptides 

A similar approach was adopted for E.coli expression of further proteins from WO99/24578,
WO99/36544 and WO99/57280.

The following were expressed without a fusion partner: 008, 105, 117-1, 121-1, 122-1, 128-1,
148, 216, 243, 308, 593, 652, 726, 982, and Orf143-1. Protein 117-1 was confirmed as
surface-exposed by FACS and gave high ELISA titres.

The following were expressed with the native leader peptide but without a fusion partner:
111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503, 519-1, 525-1,
552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 926, 936-1, 953, 961, 983,
989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2, Orf72-1,
Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1. These proteins are given the suffix 'L'.

His-tagged protein 760 was expressed with and without its leader peptide. The deletion of
the signal peptide greatly increased expression levels. The protein could be purified most
easily using 2M urea for solubilisation.

His-tagged protein 264 was well-expressed using its own signal peptide, and the 30kDa 
protein gave positive Western blot results. 

All proteins were successfully expressed. 

The localisation of 593, 121-1, 128-1, 593, 726, and 982 in the cytoplasm was confirmed.

The localisation of 920-1L, 953L, ORF9-1L, ORF85-2L, ORF97-1L, 570L, 580L and 664L 
in the periplasm was confirmed. 

The localisation of ORF40L in the outer membrane, and 008 and 519-1L in the inner
membrane was confirmed. ORF25L, ORF4L, 406L, 576-1L were all confirmed as being
localised in the membrane.

Protein 206 was found not to be a lipoprotein. 

ORF25 and ORF40 expressed with their native leader peptides but without fusion partners,
and protein 593 expressed without its native leader peptide and without a fusion partner,
raised good bactericidal sera. Surprisingly, the forms of ORF25 and ORF40 expressed
without fusion partners and using their own leader peptides (i.e. 'ORF25L' and 'ORF40L')
give better results in the bactericidal assay than the fusion proteins.

Proteins 920L and 953L were subjected to N-terminal sequencing, giving hrvwetah and
atykvdeyhanarfaf, respectively. This sequencing confirms that the predicted leader
peptides were cleaved and, when combined with the periplasmic location, confirms that the
proteins are correctly processed and localised by E.coli when expressed from their native
leader peptides.

The N-terminal sequence of protein 519.1L localised in the inner membrane was meffiilla,
indicating that the leader sequence is not cleaved. It may therefore function as both an
uncleaved leader sequence and a transmembrane anchor in a manner similar to the leader
peptide of PBP1 from N.gonorrhoeae [Ropp & Nicholas (1997) J. Bact. 179:2783-2787.].
Indeed the N-terminal region exhibits strong hydrophobic character and is predicted by the
TMpred program to be transmembrane.

Example 12 - lipoproteins 

The incorporation of palmitate in recombinant lipoproteins was demonstrated by the method
of Kraft et al [J. Bact. (1998) 180:3441-3447.]. Single colonies harbouring the plasmid of
interest were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. The
culture was diluted to an OD550 of 0.1 in 5.0 ml of fresh LB/Amp medium
containing 5 µCi/ml [3H] palmitate (Amersham). When the OD550 of the culture reached 0.4-0.8,
recombinant lipoprotein was induced for 1 hour with IPTG (final concentration 1.0
mM). Bacteria were harvested by centrifugation in a bench top centrifuge at 2700g for 15
min and washed twice with 1.0 ml cold PBS. Cells were resuspended in 120 µl of 20 mM
Tris-HCl (pH 8.0), 1 mM EDTA, 1.0% w/v SDS and lysed by boiling for 10 min. After
centrifugation at 13000g for 10 min the supernatant was collected and proteins precipitated
by the addition of 1.2 ml cold acetone and left for 1 hour at -20 °C. Protein was pelleted by
centrifugation at 13000g for 10 min and resuspended in 20-50 µl (calculated to standardise
loading with respect to the final O.D. of the culture) of 1.0% w/v SDS. An aliquot of 15 µl
was boiled with 5 µl of SDS-PAGE sample buffer and analysed by SDS-PAGE. After
electrophoresis gels were fixed for 1 hour in 10% v/v acetic acid and soaked for 30 minutes
in Amplify solution (Amersham). The gel was vacuum-dried under heat and exposed to
Hyperfilm (Kodak) overnight at -80 °C.
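The dilution step in this protocol (overnight culture diluted to an OD550 of 0.1 in 5.0 ml of fresh medium) is a C1V1 = C2V2 calculation. The following is a minimal Python sketch of that arithmetic, assuming OD550 scales linearly with cell density over the range used. The function name and the example overnight reading of 2.0 are illustrative and are not taken from the protocol.

```python
def inoculum_volume_ml(od_overnight: float,
                       target_od: float = 0.1,
                       final_volume_ml: float = 5.0) -> float:
    """Volume of overnight culture (ml) needed so that the final culture
    of final_volume_ml has the target OD (C1 * V1 = C2 * V2)."""
    if od_overnight <= target_od:
        raise ValueError("overnight culture must be denser than the target OD")
    return target_od * final_volume_ml / od_overnight


if __name__ == "__main__":
    # Hypothetical overnight reading of OD550 = 2.0
    v = inoculum_volume_ml(2.0)
    print(f"{v:.2f} ml culture + {5.0 - v:.2f} ml fresh LB/Amp")
```

With a denser or thinner overnight culture only the first argument changes; the defaults mirror the OD550 of 0.1 and the 5.0 ml volume given in the text.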

Incorporation of the [3H] palmitate label, confirming lipidation, was found for the following
proteins: Orf4L, Orf25L, 287L, 287LOrf4, 406L, 576L, 926L, 919L and 919LOrf4.

Example 13 - domains in 287 

Based on homology of different regions of 287 to proteins that belong to different functional
classes, it was split into three 'domains', as shown in Figure 5. The second domain shows
homology to IgA proteases, and the third domain shows homology to transferrin-binding
proteins.

Each of the three 'domains' shows a different degree of sequence conservation between
N. meningitidis strains - domain C is 98% identical, domain A is 83% identical, whilst
domain B is only 71% identical. Note that protein 287 in strain MC58 is 61 amino acids
longer than that of strain 2996. An alignment of the two sequences is shown in Figure 7, and
alignments for various strains are disclosed in WO00/66741 (see Figures 5 and 15 therein).

The three domains were expressed individually as C-terminal His-tagged proteins. This was 
done for the MC58 and 2996 strains, using the following constructs: 

287a-MC58 (aa 1-202), 287b-MC58 (aa 203-288), 287c-MC58 (aa 311-488).

287a-2996 (aa 1-139), 287b-2996 (aa 140-225), 287c-2996 (aa 250-427).
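The construct boundaries listed above can be checked against the stated 61-residue length difference between the two strains. The sketch below is illustrative only: it treats the coordinates as 1-based and inclusive (an assumption) and takes each strain's full length as the last residue of its 'c' construct. The dictionary and function names are hypothetical.

```python
# Construct coordinates as listed in the text (1-based, assumed inclusive).
DOMAINS = {
    "MC58": {"a": (1, 202), "b": (203, 288), "c": (311, 488)},
    "2996": {"a": (1, 139), "b": (140, 225), "c": (250, 427)},
}


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based interval."""
    return end - start + 1


def domain_lengths(strain: str) -> dict:
    """Length of each expressed domain construct for one strain."""
    return {name: span_length(*coords) for name, coords in DOMAINS[strain].items()}


if __name__ == "__main__":
    for strain in DOMAINS:
        print(strain, domain_lengths(strain))
    # Difference in full length, taken from the end of each 'c' construct.
    print(DOMAINS["MC58"]["c"][1] - DOMAINS["2996"]["c"][1])
```

On these coordinates the 'b' and 'c' constructs have the same length in both strains, so the length difference lies in domain A and in the unexpressed stretch between the 'b' and 'c' constructs.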

To make these constructs, the stop codon sequence was omitted in the 3'-end primer
sequence. The 5' primers included the NheI restriction site, and the 3' primers included a
XhoI site as a tail, in order to direct the cloning of each amplified fragment into the expression
vector pET21b+ using NdeI-XhoI, NheI-XhoI or NdeI-HindIII restriction sites.

All six constructs could be expressed, but 287b-MC58 required denaturation and refolding for
solubilisation.

Deletion of domain A is described below ('Δ4 287-His').

Immunological data (serum bactericidal assay) were also obtained using the various domains
from strain 2996, against the homologous and heterologous MenB strains, as well as MenA
(F6124 strain) and MenC (BZ133 strain):





             | 2996  | BZ232 | MC58 | NGH38 | 394/98 | MenA  | MenC
287-His      | 32000 | 16    | 4096 | 4096  | 512    | 8000  | 16000
287(B)-His   | 256   |       |      |       |        | 16    |
287(C)-His   | 256   |       | 32   | 512   | 32     | 2048  | >2048
287(B-C)-His | 64000 | 128   | 4096 | 64000 | 1024   | 64000 | 32000

Using the domains of strain MC58, the following results were obtained:

             | MC58  | 2996  | BZ232 | NGH38 | 394/98 | MenA  | MenC
287-His      | 4096  | 32000 | 16    | 4096  | 512    | 8000  | 16000
287(B)-His   | 128   | 128   |       |       |        |       | 128
287(C)-His   |       | 16    |       | 1024  |        | 512   |
287(B-C)-His | 16000 | 64000 | 128   | 64000 | 512    | 64000 | >8000

Example 14 - deletions in 287 

As well as expressing individual domains, 287 was also expressed (as a C-terminal
His-tagged protein) by making progressive deletions within the first domain.

Four deletion mutants of protein 287 from strain 2996 were used (Figure 6):

1) '287-His', consisting of amino acids 18-427 (i.e. leader peptide deleted);

2) 'Δ1 287-His', consisting of amino acids 26-427;

3) 'Δ2 287-His', consisting of amino acids 70-427;

4) 'Δ3 287-His', consisting of amino acids 107-427; and

5) 'Δ4 287-His', consisting of amino acids 140-427 (=287-bc).

The 'Δ4' protein was also made for strain MC58 ('Δ4 287MC58-His'; aa 203-488).
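The amino acid ranges listed above can be turned into construct sequences by slicing the full-length strain 2996 protein. The following Python sketch assumes the ranges are 1-based and inclusive and uses the 427-residue sequence of Example 10 as transcribed here. The names `SEQ_287_2996`, `CONSTRUCTS` and `construct_sequence` are illustrative. The vector-derived residues and the His-tag are not modelled.

```python
# Protein 287, strain 2996 (427 residues), transcribed from Example 10.
SEQ_287_2996 = (
    "MFKRSVIAMACIFALSACGGGGGGSPDVKSADTLSKPAAPVVAEKETEVK"
    "EDAPQAGSQGQGAPSTQGSQDMAAVSAENTGNGGAATTDKPKNEDEGPQN"
    "DMPQNSAESANQTGNNQPADSSDSAPASNPAPANGGSNFGRVDLANGVLI"
    "DGPSQNITLTHCKGDSCNGDNLLDEEAPSKSEFENLNESERIEKYKKDGK"
    "SDKFTNLVATAVQANGTNKYVIIYKDKSASSSSARFRRSARSRRSLPAEM"
    "PLIPVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRYLTYGAEKLPGGSYA"
    "LRVQGEPAKGEMLAGTAVYNGEVLHFHTENGRPYPTRGRFAAKVDFGSKS"
    "VDGIIDSGDDLHMGTQKFKAAIDGNGFKGTWTENGGGDVSGRFYGPAGEE"
    "VAGKYSYRPTDAEKGGFGVFAGKKEQD"
)

# Residue ranges from the list above (1-based, assumed inclusive).
CONSTRUCTS = {
    "287-His": (18, 427),
    "Δ1 287-His": (26, 427),
    "Δ2 287-His": (70, 427),
    "Δ3 287-His": (107, 427),
    "Δ4 287-His": (140, 427),
}


def construct_sequence(name: str, full: str = SEQ_287_2996) -> str:
    """Return the 287-derived portion of a named deletion construct."""
    start, end = CONSTRUCTS[name]
    return full[start - 1:end]


if __name__ == "__main__":
    for name in CONSTRUCTS:
        seq = construct_sequence(name)
        print(f"{name:12s} {len(seq):4d} aa  starts {seq[:10]}")
```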

The constructs were made in the same way as 287a/b/c, as described above. 

All six constructs could be expressed and protein could be purified. Expression of 287-His 
was, however, quite poor. 

Expression was also high when the C-terminal His-tags were omitted. 

Immunological data (serum bactericidal assay) were also obtained using the deletion
mutants, against the homologous (2996) and heterologous MenB strains, as well as MenA
(F6124 strain) and MenC (BZ133 strain):





           | 2996  | BZ232 | MC58 | NGH38 | 394/98 | MenA  | MenC
287-His    | 32000 | 16    | 4096 | 4096  | 512    | 8000  | 16000
Δ1 287-His | 16000 | 128   | 4096 | 4096  | 1024   | 8000  | 16000
Δ2 287-His | 16000 | 128   | 4096 | >2048 | 512    | 16000 | >8000
Δ3 287-His | 16000 | 128   | 4096 | >2048 | 512    | 16000 | >8000
Δ4 287-His | 64000 | 128   | 4096 | 64000 | 1024   | 64000 | 32000

The same high activity for the Δ4 deletion was seen using the sequence from strain MC58.

As well as showing superior expression characteristics, therefore, the mutants are
immunologically equivalent or superior.

Example 15 - poly-glycine deletions 

The 'Δ1 287-His' construct of the previous example differs from 287-His and from
'287untagged' only by a short N-terminal deletion (GGGGGGS). Using an expression vector
which replaces the deleted serine with a codon present in the Nhe cloning site, however, this
amounts to a deletion only of (Gly)6. Thus, the deletion of this (Gly)6 sequence has been
shown to have a dramatic effect on protein expression.

The protein lacking the N-terminal amino acids up to GGGGGG is called 'ΔG287'. In strain
MC58, its sequence (leader peptide underlined) is:

ΔG287

1 MFKRSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP VVSEKETEAK
51 EDAPQAGSQG QGAPSAQGSQ DMAAVSEENT GNGGAVTADN PKNEDEVAQN
101 DMPQNAAGTD SSTPNHTPDP NMLAGNMENQ ATDAGESSQP ANQPDMANAA
151 DGMQGDDPSA GGQNAGNTAA QGANQAGNNQ AAGSSDPIPA SNPAPANGGS
201 NFGRVDLANG VLIDGPSQNI TLTHCKGDSC SGNNFLDEEV QLKSEFEKLS
251 DADKISNYKK DGKNDKFVGL VADSVQMKGI NQYIIFYKPK PTSFARFRRS
301 ARSRRSLPAE MPLIPVNQAD TLIVDGEAVS LTGHSGNIFA PEGNYRYLTY
351 GAEKLPGGSY ALRVQGEPAK GEMLAGAAVY NGEVLHFHTE NGRPYPTRGR
401 FAAKVDFGSK SVDGIIDSGD DLHMGTQKFK AAIDGNGFKG TWTENGSGDV
451 SGKFYGPAGE EVAGKYSYRP TDAEKGGFGV FAGKKEQD*

ΔG287, with or without His-tag ('ΔG287-His' and 'ΔG287K', respectively), are expressed at
very good levels in comparison with the '287-His' or '287untagged'.

On the basis of gene variability data, variants of ΔG287-His were expressed in E.coli from a
number of MenB strains, in particular from strains 2996, MC58, 1000, and BZ232. The
results were also good.

It was hypothesised that poly-Gly deletion might be a general strategy to improve
expression. Other MenB lipoproteins containing similar (Gly)n motifs (near the N-terminus,
downstream of a cysteine) were therefore identified, namely Tbp2 (NMB0460), 741 (NMB1870)
and 983 (NMB1969):

TBP2   ΔGTbp2

1 MNNPLVNQAA MVLPVFLLSA CLGGGGSFDL DSVDTEAPRP APKYQDVFSE
51 KPQAQKDQGG YGFAMRUCRR NWYPQAKEDE VKLDESDWEA TGLPDEPKEL
101 PKRQKSVIEK VETDSDNNIY SSPYLKPSNH QNGNTGNGIN QPKNQAKDYE
151 NFKYVYSGWF YKHAKREFNL KVEPKSAKNG DDGYIFYHGK EPSRQLPASG
201 KITYKGVWHF ATDTKKGQKF REIIQPSKSQ GDRYSGFSGD DGEEYSNKNK
251 STLTDGQEGY GFTSNLEVDF HNKKLTGKLI RNNANTDNNQ ATTTQYYSLE
301 AQVTGNRFNG KATATDKPQQ NSETKEHPFV SDSSShSGGF FGPQGEELGF
351 RFLSDDQKVA WGSAKTKDK PANGNTAAAS GGTDAAASNG AAGTSSENGK
401 LTTVLDAVEL KLGDKEVQKL DNFSNAAQLV VDGIMIPLLP EASESGNNQA
451 NQGTNGGTAF TRKFDHTPES DKKDAQAGTQ TNGAQTASOT AGDTNGKTKT
501 YEVEVCCSNL NYLKYGMTjTR KNSRSAMQAG ESSSQADAKT EQVEQSMFLQ
551 GERTDEKEIP SEQNIVYRGS WYGYIANDKS TSWSGNASNA TSGNRAEFTV
601 NFADKKITGT LTADNRQEAT FTIDGNIKDN GFEGTAKTAE SGFDLDQSNT
651 TRTPRAYITD AKVQGGFYGP KAEELGGWFA YPGDKQTKNA TNASGNSSAT
701 WFGAKRQQP TO*

741   ΔG741

1 VNRTAFCCLS LTTALILTAC SSGGGGVAAD IGAGLADALT APLDHKDKGL
51 QSLTLDQSVR KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ
101 IEVDGQLITL ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI
151 GDIAGEHTSF DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI
201 EHLKSPELNV DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA
251 QEVAGSAEVK TVNGIRHIGL AAKQ*

983   ΔG983

1 MRTTPTFPTK TFKPTAMALA VATTLSACLG GGGGGTSAPD FNAGGTGIGS
51 NSRATTAKSA AVSYAGIKNE MCKDRSMLCA GRDDVAVTDR DAKINAPPPN
101 LHTGDFPNPN DAYKNLINLK PAIEAGYTGR GVEVGIVDTG ESVGSISFPE
151 LYGRKEHGYN ENYKNYTAYM RKEAPEDGGG KDIEASFDDE AVIETEAKPT
201 DIRHVKEIGH IDLVSHIIGG RSVDGRPAGG IAPDATLHIM NTNDETKNEM
251 MVAAIRNAWV KLGERGVRIV NNSFGTTSRA GTADLFQIAN SEEQYRQALL
301 DYSGGDKTDE GIRLMQQSDY GNLSYHIRNK NMLFIFSTGN DAQAQPNTYA
351 LLPFYEKDAQ KGIITVAGVD RSGEKFKREM YGEPGTEPLE YGSNHCGITA
401 MWCLSAPYEA SVRFTRTNPI QIAGTSFSAP IVTGTAALLL QKYPWMSNDN
451 LRTTLLTTAQ DIGAVGVDSK FGWGLLDAGK AMNGPASFPF GDFTADTKGT
501 SDIAYSFRND ISGTGGLIKK GGSQLQLHGN NTYTGKTIIE GGSLVLYGNN
551 KSDMRVETKG ALIYNGAASG GSLNSDGIVY LADTDQSGAN ETVHIKGSLQ
601 LDGKGTLYTR LGKLLKVDGT AIIGGKLYMS ARGKGAGYLN STGRRVPFLS
651 AAKIGQDYSF FTNIETDGGL LASLDSVEKT AGSEGOTLSY YVRRGNAART
701 ASAAAHSAPA GLKHAVEQGG SNLENLMVEL DASESSATPE TVETAAADRT
751 DMPGIRPYGA TFRAAAAVQH ANAADGVRIF NSLAATVYAD STAAHAEMQG
801 RRLKAVSDGL DHNGTGLRVI AQTQQDGGTW EQGGVEGKMR GSTQTVGIAA
851 KTGENTTAAA TLGMGRSTWS ENSANAKTDS ISLFAGIRHD AGDIGYLKGL
901 FSYGRYKNSI SRSTGADEHA EGSVNGTLMQ LGALGGVNVP FAATGDLTVE
951 GGLRYDLLKQ DAFAEKGSAL GWSGNSLTEG TLVGLAGLKL SQPLSDKAVL
1001 FATAGVERDL NGRDYTVTGG FTGATAATGK TGARNMPHTR LVAGLGADVE
1051 FGNGWNGLAR YSYAGSKQYG NHSGRVGVGY RF*
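The search for lipoproteins carrying a (Gly)n motif near the N-terminus, downstream of a cysteine, can be sketched as a pattern match over the first residues of each sequence. The Python example below is illustrative only: the pattern (a cysteine followed within three residues by four or more glycines, inside the first 40 residues) is one possible reading of that description, not a criterion stated in the text, and the function names are hypothetical. The N-terminal strings are the first 40 residues of the sequences listed above.

```python
import re

# Cysteine, then at most three other residues, then a run of >= 4 glycines.
POLY_GLY = re.compile(r"C[A-Z]{0,3}?(G{4,})")

# First 40 residues of each protein as listed in this example.
N_TERMINI = {
    "287 (MC58)": "MFKRSVIAMACIFALSACGGGGGGSPDVKSADTLSKPAAP",
    "Tbp2":       "MNNPLVNQAAMVLPVFLLSACLGGGGSFDLDSVDTEAPRP",
    "741":        "VNRTAFCCLSLTTALILTACSSGGGGVAADIGAGLADALT",
    "983":        "MRTTPTFPTKTFKPTAMALAVATTLSACLGGGGGGTSAPD",
}


def find_poly_gly(seq: str, window: int = 40):
    """Return (start, end) of the glycine run (0-based, end exclusive),
    or None if no motif lies within the first `window` residues."""
    match = POLY_GLY.search(seq[:window])
    return None if match is None else (match.start(1), match.end(1))


def delete_through_poly_gly(seq: str, window: int = 40) -> str:
    """Drop all N-terminal residues up to and including the glycine run."""
    span = find_poly_gly(seq, window)
    return seq if span is None else seq[span[1]:]


if __name__ == "__main__":
    for name, nterm in N_TERMINI.items():
        print(name, find_poly_gly(nterm), delete_through_poly_gly(nterm)[:8])
```

The truncation here removes everything up to the end of the glycine run, following the description of ΔG287 as the protein lacking the N-terminal amino acids up to GGGGGG. Residues contributed by the cloning site are not modelled.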

Tbp2 and 741 genes were from strain MC58; 983 and 287 genes were from strain 2996.
These were cloned in pET vector and expressed in E.coli without the sequence coding for
their leader peptides or as "ΔG forms", both fused to a C-terminal His-tag. In each case, the
same effect was seen - expression was good in the clones carrying the deletion of the
poly-glycine stretch, and poor or absent if the glycines were present in the expressed protein:

ORF                 | Express. | Purification | Bact. Activity
287-His(2996)       | +/-      | +            | +
'287untagged'(2996) | +/-      | nd           | nd
ΔG287-His(2996)     | +        | +            | +
ΔG287K(2996)        | +        | +            | +
ΔG287-His(MC58)     | +        | +            | +
ΔG287-His(1000)     | +        | +            |
ΔG287-His(BZ232)    | +        |              |
Tbp2-His(MC58)      | +/-      | nd           | nd
ΔGTbp2-His(MC58)    | +        | +            |
741-His(MC58)       | +/-      | nd           | nd
ΔG741-His(MC58)     | +        | +            |
983-His(2996)       |          |              |
ΔG983-His(2996)     | +        |              |

SDS-PAGE of the proteins is shown in Figure 13. 



ΔG287 and hybrids

ΔG287 proteins were made and purified for strains MC58, 1000 and BZ232. Each of these
gave high ELISA titres and also serum bactericidal titres of >8192. ΔG287K, expressed from
pET-24b, gave excellent titres in ELISA and the serum bactericidal assay.
ΔG287-ORF46.1K may also be expressed in pET-24b.

ΔG287 was also fused directly in-frame upstream of 919, 953, 961 (sequences shown below)
and ORF46.1:



ΔG287-919

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGATGCCAAA GCAAGAGCAT
1251 CCAAACCTTT CCGCAACCCG ACACATCCGT CATCAACGGC CCGGACCGGC
1301 CGGTCGGCAT CCCCGACCCC GCCGGAACGA CGGTCGGCGG CGGCGGGGCC
1351 GTCTATACCG TTGTACCGCA CCTGTCCCTG CCCCACTGGG CGGCGCAGGA
1401 TTTCGCCAAA AGCCTGCAAT CCTTCCGCCT CGGCTGCGCC AATTTGAAAA
1451 ACCGCCAAGG CTGGCAGGAT GTGTGCGCCC AAGCCTTTCA AACCCCCGTC
1501 CATTCCTTTC AGGCAAAACA GTTTTTTGAA CGCTATTTCA CGCCGTGGCA
1551 GGTTGCAGGC AACGGAAGCC TTGCCGGTAC GGTTACCGGC TATTACGAGC
1601 CGGTGCTGAA GGGCGACGAC AGGCGGACGG CACAAGCCCG CTTCCCGATT
1651 TACGGTATTC CCGACGATTT TATCTCCGTC CCCCTGCCTG CCGGTTTGCG
1701 GAGCGGAAAA GCCCTTGTCC GCATCAGGCA GACGGGAAAA AACAGCGGCA
1751 CAATCGACAA TACCGGCGGC ACACATACCG CCGACCTCTC CCGATTCCCC
1801 ATCACCGCGC GCACAACGGC AATCAAAGGC AGGTTTGAAG GAAGCCGCTT
1851 CCTCCCCTAC CACACGCGCA ACCAAATCAA CGGCGGCGCG CTTGACGGCA
1901 AAGCCCCGAT ACTCGGTTAC GCCGAAGACC CCGTCGAACT TTTTTTTATG
1951 CACATCCAAG GCTCGGGCCG TCTGAAAACC CCGTCCGGCA AATACATCCG
2001 CATCGGCTAT GCCGACAAAA ACGAACATCC CTACGTTTCC ATCGGACGCT
2051 ATATGGCGGA CAAAGGCTAC CTCAAGCTCG GGCAGACCTC GATGCAGGGC
2101 ATCAAAGCCT ATATGCGGCA AAATCCGCAA CGCCTCGCCG AAGTTTTGGG
2151 TCAAAACCCC AGCTATATCT TTTTCCGCGA GCTTGCCGGA AGCAGCAATG
2201 ACGGTCCCGT CGGCGCACTG GGCACGCCGT TGATGGGGGA ATATGCCGGC
2251 GCAGTCGACC GGCACTACAT TACCTTGGGC GCGCCCTTAT TTGTCGCCAC
2301 CGCCCATCCG GTTACCCGCA AAGCCCTCAA CCGCCTGATT ATGGCGCAGG
2351 ATACCGGCAG CGCGATTAAA GGCGCGGTGC GCGTGGATTA TTTTTGGGGA
2401 TACGGCGACG AAGCCGGCGA ACTTGCCGGC AAACAGAAAA CCACGGGTTA
2451 CGTCTGGCAG CTCCTACCCA ACGGTATGAA GCCCGAATAC CGCCCGTAAC
2501 TCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GCQSKSIQTF PQPDTSVING PDRPVGIPDP AGTTVGGGGA
451 VYTVVPHLSL PHWAAQDFAK SLQSFRLGCA NLKNRQGWQD VCAQAFQTPV
501 HSFQAKQFFE RYFTPWQVAG NGSLAGTVTG YYEPVLKGDD RRTAQARFPI
551 YGIPDDFISV PLPAGLRSGK ALVRIRQTGK NSGTIDNTGG THTADLSRFP
601 ITARTTAIKG RFEGSRFLPY HTRNQINGGA LDGKAPILGY AEDPVELFFM
651 HIQGSGRLKT PSGKYIRIGY ADKNEHPYVS IGRYMADKGY LKLGQTSMQG
701 IKAYMRQNPQ RLAEVLGQNP SYIFFRELAG SSNDGPVGAL GTPLMGEYAG
751 AVDRHYITLG APLFVATAHP VTRKALNRLI MAQDTGSAIK GAVRVDYFWG
801 YGDEAGELAG KQKTTGYVWQ LLPNGMKPEY RP*

ΔG287-953

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACCT ACAAAGTGGA
1251 CGAATATCAC GCCAACGCCC GTTTCGCCAT CGACCATTTC AACACCAGCA
1301 CCAACGTCGG CGGTTTTTAC GGTCTGACCG GTTCCGTCGA GTTCGACCAA
1351 GCAAAACGCG ACGGTAAAAT CGACATCACC ATCCCCGTTG CCAACCTGCA
1401 AAGCGGTTCG CAACACTTTA CCGACCACCT GAAATCAGCC GACATCTTCG
1451 ATGCCGCCCA ATATCCGGAC ATCCGCTTTG TTTCCACCAA ATTCAACTTC
1501 AACGGCAAAA AACTGGTTTC CGTTGACGGC AACCTGACCA TGCACGGCAA
1551 AACCGCCCCC GTCAAACTCA AAGCCGAAAA ATTCAACTGC TACCAAAGCC
1601 CGATGGCGAA AACCGAAGTT TGCGGCGGCG ACTTCAGCAC CACCATCGAC
1651 CGCACCAAAT GGGGCGTGGA CTACCTCGTT AACGTTGGTA TGACCAAAAG
1701 CGTCCGCATC GACATCCAAA TCGAGGCAGC CAAACAATAA CTCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GATYKVDEYH ANARFAIDHF NTSTNVGGFY GLTGSVEFDQ
451 AKRDGKIDIT IPVANLQSGS QHFTDHLKSA DIFDAAQYPD IRFVSTKFNF
501 NGKKLVSVDG NLTMHGKTAP VKLKAEKFNC YQSPMAKTEV CGGDFSTTID
551 RTKWGVDYLV NVGMTKSVRI DIQIEAAKQ*

ΔG287-961



1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACAA ACGACGACGA
1251 TGTTAAAAAA GCTGCCACTG TGGCCATTGC TGCTGCCTAC AACAATGGCC
1301 AAGAAATCAA CGGTTTCAAA GCTGGAGAGA CCATCTACGA CATTGATGAA
1351 GACGGCACAA TTACCAAAAA AGACGCAACT GCAGCCGATG TTGAAGCCGA
1401 CGACTTTAAA GGTCTGGGTC TGAAAAAAGT CGTGACTAAC CTGACCAAAA
1451 CCGTCAATGA AAACAAACAA AACGTCGATG CCAAAGTAAA AGCTGCAGAA
1501 TCTGAAATAG AAAAGTTAAC AACCAAGTTA GCAGACACTG ATGCCGCTTT
1551 AGCAGATACT GATGCCGCTC TGGATGCAAC CACCAACGCC TTGAATAAAT
1601 TGGGAGAAAA TATAACGACA TTTGCTGAAG AGACTAAGAC AAATATCGTA
1651 AAAATTGATG AAAAATTAGA AGCCGTGGCT GATACCGTCG ACAAGCATGC
1701 CGAAGCATTC AACGATATCG CCGATTCATT GGATGAAACC AACACTAAGG
1751 CAGACGAAGC CGTCAAAACC GCCAATGAAG CCAAACAGAC GGCCGAAGAA
1801 ACCAAACAAA ACGTCGATGC CAAAGTAAAA GCTGCAGAAA CTGCAGCAGG
1851 CAAAGCCGAA GCTGCCGCTG GCACAGCTAA TACTGCAGCC GACAAGGCCG
1901 AAGCTGTCGC TGCAAAAGTT ACCGACATCA AAGCTGATAT CGCTACGAAC
1951 AAAGATAATA TTGCTAAAAA AGCAAACAGT GCCGACGTGT ACACCAGAGA
2001 AGAGTCTGAC AGCAAATTTG TCAGAATTGA TGGTCTGAAC GCTACTACCG
2051 AAAAATTGGA CACACGCTTG GCTTCTGCTG AAAAATCCAT TGCCGATCAC
2101 GATACTCGCC TGAACGGTTT GGATAAAACA GTGTCAGACC TGCGCAAAGA
2151 AACCCGCCAA GGCCTTGCAG AACAAGCCGC GCTCTCCGGT CTGTTCCAAC
2201 CTTACAACGT GGGTCGGTTC AATGTAACGG CTGCAGTCGG CGGCTACAAA
2251 TCCGAATCGG CAGTCGCCAT CGGTACCGGC TTCCGCTTTA CCGAAAACTT
2301 TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC TTCGTCCGGT TCTTCCGCAG
2351 CCTACCATGT CGGCGTCAAT TACGAGTGGT AACTCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE
451 DGTITKKDAT AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE
501 SEIEKLTTKL ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV
551 KIDEKLEAVA DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE
601 TKQNVDAKVK AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN
651 KDNIAKKANS ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH
701 DTRLNGLDKT VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK
751 SESAVAIGTG FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEW*
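The agreement between the nucleotide and amino acid listings of the hybrid constructs can be spot-checked by translating short stretches of the coding sequence. The Python sketch below builds the standard genetic code table and translates two 919-hybrid fragments: the first 60 nucleotides of the ΔG287 portion, and the 50 nucleotides starting at position 1201, which span the junction with 919. The function and variable names are illustrative, and the fragments are transcribed from the listing above.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring whitespace and any trailing
    partial codon. Stop codons are rendered as '*'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-60 of the hybrid ORFs (start of the ΔG287 portion).
START = "ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC TCCTGTTGTT"
# Nucleotides 1201-1250 of ΔG287-919 (in frame, since 1200 is a multiple of 3).
JUNCTION_919 = "AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGATGCCAAA GCAAGAGCAT"

if __name__ == "__main__":
    print(translate(START))
    print(translate(JUNCTION_919))
```

In the junction fragment the end of the 287 portion (KKEQD) is followed by a short GSGGGG stretch and then the first residues of the 919 portion, which matches the amino acid listing at positions 401-416.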





              | ELISA  | Bactericidal
ΔG287-953-His | 3834   | 65536
ΔG287-961-His | 108627 | 65536


The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins
was compared with antibodies raised against simple mixtures of the component antigens
(using 287-GST) for 919 and ORF46.1:

        | Mixture with 287 | Hybrid with ΔG287
919     | 32000            | 128000
ORF46.1 | 128              | 16000


Data for bactericidal activity against heterologous MenB strains and against serotypes A and
C were also obtained:

             | 919               | ORF46.1
Strain       | Mixture | Hybrid  | Mixture | Hybrid
NGH38        | 1024    | 32000   |         | 16384
MC58         | 512     | 8192    |         | 512
BZ232        | 512     | 512     |         |
MenA (F6124) | 512     | 32000   |         | 8192
MenC (C11)   | >2048   | >2048   |         |
MenC (BZ133) | >4096   | 64000   |         | 8192


The hybrid proteins with ΔG287 at the N-terminus are therefore immunologically superior to
simple mixtures, with ΔG287-ORF46.1 being particularly effective, even against
heterologous strains. ΔG287-ORF46.1K may be expressed in pET-24b.
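The size of the hybrid-versus-mixture difference can be expressed as a fold change in bactericidal titre. The short Python sketch below uses the homologous-strain titres reported for 919 and ORF46.1. A simple ratio of titres is only a rough comparison, and the names used are illustrative.

```python
# Homologous-strain bactericidal titres reported for mixtures and hybrids.
HOMOLOGOUS_TITRES = {
    "919": {"mixture": 32000, "hybrid": 128000},
    "ORF46.1": {"mixture": 128, "hybrid": 16000},
}


def fold_increase(mixture_titre: float, hybrid_titre: float) -> float:
    """Ratio of the hybrid titre to the mixture titre."""
    if mixture_titre <= 0:
        raise ValueError("mixture titre must be positive")
    return hybrid_titre / mixture_titre


if __name__ == "__main__":
    for antigen, titres in HOMOLOGOUS_TITRES.items():
        ratio = fold_increase(titres["mixture"], titres["hybrid"])
        print(f"{antigen}: {ratio:g}-fold")
```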

The same hybrid proteins were made using New Zealand strain 394/98 rather than 2996: 



AG287NZ-919

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGATG CCAAAGCAAG AGCATCCAAA
1451 CCTTTCCGCA ACCCGACACA TCCGTCATCA ACGGCCCGGA CCGGCCGGTC
1501 GGCATCCCCG ACCCCGCCGG AACGACGGTC GGCGGCGGCG GGGCCGTCTA
1551 TACCGTTGTA CCGCACCTGT CCCTGCCCCA CTGGGCGGCG CAGGATTTCG
1601 CCAAAAGCCT GCAATCCTTC CGCCTCGGCT GCGCCAATTT GAAAAACCGC
1651 CAAGGCTGGC AGGATGTGTG CGCCCAAGCC TTTCAAACCC CCGTCCATTC
1701 CTTTCAGGCA AAACAGTTTT TTGAACGCTA TTTCACGCCG TGGCAGGTTG
1751 CAGGCAACGG AAGCCTTGCC GGTACGGTTA CCGGCTATTA CGAGCCGGTG
1801 CTGAAGGGCG ACGACAGGCG GACGGCACAA GCCCGCTTCC CGATTTACGG
1851 TATTCCCGAC GATTTTATCT CCGTCCCCCT GCCTGCCGGT TTGCGGAGCG
1901 GAAAAGCCCT TGTCCGCATC AGGCAGACGG GAAAAAACAG CGGCACAATC
1951 GACAATACCG GCGGCACACA TACCGCCGAC CTCTCCCGAT TCCCCATCAC
2001 CGCGCGCACA ACGGCAATCA AAGGCAGGTT TGAAGGAAGC CGCTTCCTCC
2051 CCTACCACAC GCGCAACCAA ATCAACGGCG GCGCGCTTGA CGGCAAAGCC
2101 CCGATACTCG GTTACGCCGA AGACCCCGTC GAACTTTTTT TTATGCACAT
2151 CCAAGGCTCG GGCCGTCTGA AAACCCCGTC CGGCAAATAC ATCCGCATCG
2201 GCTATGCCGA CAAAAACGAA CATCCCTACG TTTCCATCGG ACGCTATATG
2251 GCGGACAAAG GCTACCTCAA GCTCGGGCAG ACCTCGATGC AGGGCATCAA
2301 AGCCTATATG CGGCAAAATC CGCAACGCCT CGCCGAAGTT TTGGGTCAAA
2351 ACCCCAGCTA TATCTTTTTC CGCGAGCTTG CCGGAAGCAG CAATGACGGT
2401 CCCGTCGGCG CACTGGGCAC GCCGTTGATG GGGGAATATG CCGGCGCAGT
2451 CGACCGGCAC TACATTACCT TGGGCGCGCC CTTATTTGTC GCCACCGCCC
2501 ATCCGGTTAC CCGCAAAGCC CTCAACCGCC TGATTATGGC GCAGGATACC
2551 GGCAGCGCGA TTAAAGGCGC GGTGCGCGTG GATTATTTTT GGGGATACGG
2601 CGACGAAGCC GGCGAACTTG CCGGCAAACA GAAAACCACG GGTTACGTCT
2651 GGCAGCTCCT ACCCAACGGT ATGAAGCCCG AATACCGCCC GTAAAAGCTT

1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM
51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM
101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG
151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL
201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV
251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ
301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP
351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS
401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY
451 RPTDAEKGGF GVFAGKKEQD GSGGGGCQSK SIQTFPQPDT SVINGPDRPV
501 GIPDPAGTTV GGGGAVYTVV PHLSLPHWAA QDFAKSLQSF RLGCANLKNR
551 QGWQDVCAQA FQTPVHSFQA KQFFERYFTP WQVAGNGSLA GTVTGYYEPV
601 LKGDDRRTAQ ARFPIYGIPD DFISVPLPAG LRSGKALVRI RQTGKNSGTI
651 DNTGGTHTAD LSRFPITART TAIKGRFEGS RFLPYHTRNQ INGGALDGKA
701 PILGYAEDPV ELFFMHIQGS GRLKTPSGKY IRIGYADKNE HPYVSIGRYM
751 ADKGYLKLGQ TSMQGIKAYM RQNPQRLAEV LGQNPSYIFF RELAGSSNDG
801 PVGALGTPLM GEYAGAVDRH YITLGAPLFV ATAHPVTRKA LNRLIMAQDT
851 GSAIKGAVRV DYFWGYGDEA GELAGKQKTT GYVWQLLPNG MKPEYRP*
AG287NZ-953

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGAGC CACCTACAAA GTGGACGAAT
1451 ATCACGCCAA CGCCCGTTTC GCCATCGACC ATTTCAACAC CAGCACCAAC
1501 GTCGGCGGTT TTTACGGTCT GACCGGTTCC GTCGAGTTCG ACCAAGCAAA
1551 ACGCGACGGT AAAATCGACA TCACCATCCC CGTTGCCAAC CTGCAAAGCG
1601 GTTCGCAACA CTTTACCGAC CACCTGAAAT CAGCCGACAT CTTCGATGCC
1651 GCCCAATATC CGGACATCCG CTTTGTTTCC ACCAAATTCA ACTTCAACGG
1701 CAAAAAACTG GTTTCCGTTG ACGGCAACCT GACCATGCAC GGCAAAACCG
1751 CCCCCGTCAA ACTCAAAGCC GAAAAATTCA ACTGCTACCA AAGCCCGATG
1801 GCGAAAACCG AAGTTTGCGG CGGCGACTTC AGCACCACCA TCGACCGCAC
1851 CAAATGGGGC GTGGACTACC TCGTTAACGT TGGTATGACC AAAAGCGTCC
1901 GCATCGACAT CCAAATCGAG GCAGCCAAAC AATAAAAGCT T



1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM
51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM
101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG
151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL
201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV
251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ
301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP
351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS
401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY
451 RPTDAEKGGF GVFAGKKEQD GSGGGGATYK VDEYHANARF AIDHFNTSTN
501 VGGFYGLTGS VEFDQAKRDG KIDITIPVAN LQSGSQHFTD HLKSADIFDA
551 AQYPDIRFVS TKFNFNGKKL VSVDGNLTMH GKTAPVKLKA EKFNCYQSPM
601 AKTEVCGGDF STTIDRTKWG VDYLVNVGMT KSVRIDIQIE AAKQ*



AG287NZ-961

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGAGC CACAAACGAC GACGATGTTA
1451 AAAAAGCTGC CACTGTGGCC ATTGCTGCTG CCTACAACAA TGGCCAAGAA
1501 ATCAACGGTT TCAAAGCTGG AGAGACCATC TACGACATTG ATGAAGACGG
1551 CACAATTACC AAAAAAGACG CAACTGCAGC CGATGTTGAA GCCGACGACT
1601 TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA CTAACCTGAC CAAAACCGTC
1651 AATGAAAACA AACAAAACGT CGATGCCAAA GTAAAAGCTG CAGAATCTGA
1701 AATAGAAAAG TTAACAACCA AGTTAGCAGA CACTGATGCC GCTTTAGCAG
1751 ATACTGATGC CGCTCTGGAT GCAACCACCA ACGCCTTGAA TAAATTGGGA
1801 GAAAATATAA CGACATTTGC TGAAGAGACT AAGACAAATA TCGTAAAAAT
1851 TGATGAAAAA TTAGAAGCCG TGGCTGATAC CGTCGACAAG CATGCCGAAG
1901 CATTCAACGA TATCGCCGAT TCATTGGATG AAACCAACAC TAAGGCAGAC
1951 GAAGCCGTCA AAACCGCCAA TGAAGCCAAA CAGACGGCCG AAGAAACCAA
2001 ACAAAACGTC GATGCCAAAG TAAAAGCTGC AGAAACTGCA GCAGGCAAAG
2051 CCGAAGCTGC CGCTGGCACA GCTAATACTG CAGCCGACAA GGCCGAAGCT
2101 GTCGCTGCAA AAGTTACCGA CATCAAAGCT GATATCGCTA CGAACAAAGA
2151 TAATATTGCT AAAAAAGCAA ACAGTGCCGA CGTGTACACC AGAGAAGAGT
2201 CTGACAGCAA ATTTGTCAGA ATTGATGGTC TGAACGCTAC TACCGAAAAA
2251 TTGGACACAC GCTTGGCTTC TGCTGAAAAA TCCATTGCCG ATCACGATAC
2301 TCGCCTGAAC GGTTTGGATA AAACAGTGTC AGACCTGCGC AAAGAAACCC
2351 GCCAAGGCCT TGCAGAACAA GCCGCGCTCT CCGGTCTGTT CCAACCTTAC
2401 AACGTGGGTC GGTTCAATGT AACGGCTGCA GTCGGCGGCT ACAAATCCGA
2451 ATCGGCAGTC GCCATCGGTA CCGGCTTCCG CTTTACCGAA AACTTTGCCG
2501 CCAAAGCAGG CGTGGCAGTC GGCACTTCGT CCGGTTCTTC CGCAGCCTAC
2551 CATGTCGGCG TCAATTACGA GTGGTAAAAG CTT



1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM

51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM 

101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG 

151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL

201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV 

251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ 

301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP 

351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS 

401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY 

451 RPTDAEKGGF GVFAGKKEQD GSGGGGATND DDVKKAATVA IAAAYNNGQE 

501 INGFKAGETI YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV

551 NENKQNVDAK VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG 

601 ENITTFAEET KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD 

651 EAVKTANEAK QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA 

701 VAAKVTDIKA DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK 

751 LDTRLASAEK SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY

801 NVGRFNVTAA VGGYKSESAV AIGTGFRFTE NFAAKAGVAV GTSSGSSAAY 

851 HVGVNYEW* 
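
The nucleotide and amino-acid listings can be cross-checked by translating a nucleotide row in frame. The following is a minimal Python sketch; the helper name `translate` and the use of the standard genetic code are illustrative assumptions and are not part of the original disclosure. It is applied to the final row of the AG287NZ-961 nucleotide listing, which starts at position 2551 and therefore begins on a codon boundary (2551 = 3 x 850 + 1).

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping after the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(residue)
        if residue == "*":  # '*' marks a stop codon, as in the listings
            break
    return "".join(protein)


# Row 2551 of the AG287NZ-961 nucleotide listing, copied with its spacing.
last_row = "CATGTCGGCG TCAATTACGA GTGGTAAAAG CTT"
print(translate(last_row))  # HVGVNYEW*
```

The result agrees with row 851 of the amino-acid listing. The same check can be run on any row whose starting position is one more than a multiple of three.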

AG983 and hybrids 

Bactericidal titres generated in response to AG983 (His-fusion) were measured against 
various strains, including the homologous 2996 strain: 





         2996    NGH38   BZ133
AG983    512     128     128



AG983 was also expressed as a hybrid, with ORF46.1, 741, 961 or 961c at its C-terminus: 

AG983-ORF46.1 

1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA 

51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA 

101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC 

151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT 

201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA 

251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA 

301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT 

351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG 

401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA 

451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA 

501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA 

551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT 

601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT 

651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC 

701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC 

751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA 

801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA 

851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC 

901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT 

951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG 

1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG 

1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT 

1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA 
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC

1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT 

1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG 

1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA 

1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC

1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA 

1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG 

1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA 

1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG 

1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC

1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT 

1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG 

1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG 

1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC 

1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG

1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA 

1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC 

2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC 

2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA 

2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA

2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG 

2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC 

2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG 

2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC 

2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT

2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA 

2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA 

2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT 

2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT 

2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG

2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG 

2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG 

2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA 

2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA 

2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT

2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA 

2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC 

3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT 

3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC 

3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAC

3151 GGTGGCGGAG GCACTGGATC CTCAGATTTG GCAAACGATT CTTTTATCCG 

3201 GCAGGTTCTC GACCGTCAGC ATTTCGAACC CGACGGGAAA TACCACCTAT 

3251 TCGGCAGCAG GGGGGAACTT GCCGAGCGCA GCGGCCATAT CGGATTGGGA 

3301 AAAATACAAA GCCATCAGTT GGGCAACCTG ATGATTCAAC AGGCGGCCAT 

3351 TAAAGGAAAT ATCGGCTACA TTGTCCGCTT TTCCGATCAC GGGCACGAAG

3401 TCCATTCCCC CTTCGACAAC CATGCCTCAC ATTCCGATTC TGATGAAGCC 

3451 GGTAGTCCCG TTGACGGATT TAGCCTTTAC CGCATCCATT GGGACGGATA 

3501 CGAACACCAT CCCGCCGACG GCTATGACGG GCCACAGGGC GGCGGCTATC 

3551 CCGCTCCCAA AGGCGCGAGG GATATATACA GCTACGACAT AAAAGGCGTT 

3601 GCCCAAAATA TCCGCCTCAA CCTGACCGAC AACCGCAGCA CCGGACAACG

3651 GCTTGCCGAC CGTTTCCACA ATGCCGGTAG TATGCTGACG CAAGGAGTAG 

3701 GCGACGGATT CAAACGCGCC ACCCGATACA GCCCCGAGCT GGACAGATCG 

3751 GGCAATGCCG CCGAAGCCTT CAACGGCACT GCAGATATCG TTAAAAACAT 

3801 CATCGGCGCG GCAGGAGAAA TTGTCGGCGC AGGCGATGCC GTGCAGGGCA 

3851 TAAGCGAAGG CTCAAACATT GCTGTCATGC ACGGCTTGGG TCTGCTTTCC

3901 ACCGAAAACA AGATGGCGCG CATCAACGAT TTGGCAGATA TGGCGCAACT 

3951 CAAAGACTAT GCCGCAGCAG CCATCCGCGA TTGGGCAGTC CAAAACCCCA 

4001 ATGCCGCACA AGGCATAGAA GCCGTCAGCA ATATCTTTAT GGCAGCCATC 

4051 CCCATCAAAG GGATTGGAGC TGTTCGGGGA AAATACGGCT TGGGCGGCAT 

4101 CACGGCACAT CCTATCAAGC GGTCGCAGAT GGGCGCGATC GCATTGCCGA

4151 AAGGGAAATC CGCCGTCAGC GACAATTTTG CCGATGCGGC ATACGCCAAA 

4201 TACCCGTCCC CTTACCATTC CCGAAATATC CGTTCAAACT TGGAGCAGCG 

4251 TTACGGCAAA GAAAACATCA CCTCCTCAAC CGTGCCGCCG TCAAACGGCA 

4301 AAAATGTCAA ACTGGCAGAC CAACGCCACC CGAAGACAGG CGTACCGTTT 

4351 GACGGTAAAG GGTTTCCGAA TTTTGAGAAG CACGTGAAAT ATGATACGCT

4401 CGAGCACCAC CACCACCACC ACTGA 
1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD
51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV
101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE
151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD
201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD
251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF
301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP
351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG
401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG
451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT
501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT
551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK
601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE
651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE
701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA
751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG
801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF
851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL
901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG
951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR
1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLD
1051 GGGGTGSSDL ANDSFIRQVL DRQHFEPDGK YHLFGSRGEL AERSGHIGLG
1101 KIQSHQLGNL MIQQAAIKGN IGYIVRFSDH GHEVHSPFDN HASHSDSDEA
1151 GSPVDGFSLY RIHWDGYEHH PADGYDGPQG GGYPAPKGAR DIYSYDIKGV
1201 AQNIRLNLTD NRSTGQRLAD RFHNAGSMLT QGVGDGFKRA TRYSPELDRS
1251 GNAAEAFNGT ADIVKNIIGA AGEIVGAGDA VQGISEGSNI AVMHGLGLLS
1301 TENKMARIND LADMAQLKDY AAAAIRDWAV QNPNAAQGIE AVSNIFMAAI
1351 PIKGIGAVRG KYGLGGITAH PIKRSQMGAI ALPKGKSAVS DNFADAAYAK
1401 YPSPYHSRNI RSNLEQRYGK ENITSSTVPP SNGKNVKLAD QRHPKTGVPF
1451 DGKGFPNFEK HVKYDTLEHH HHHH*



AG983-741 
1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG

1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG 

1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC 

1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG 

1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA

1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC 

2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC 

2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA 

2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA 

2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG

2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC 

2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG 

2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC 

2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT 

2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA

2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA 

2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT 

2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT 

2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG 

2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG

2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG 

2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA 

2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA 

2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT 

2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA

2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC 

3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT 

3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC 

3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG 

3151 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA

3201 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA 

3251 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA 

3301 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT 

3351 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG 

3401 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA

3451 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA 

3501 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG 

3551 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA 

3601 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA 

3651 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA

3701 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT 

3751 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA 

3801 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG 

3851 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT 

3901 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD 

51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV 

101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE

151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD

201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD 

251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF 

301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP 

351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG 

401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG

451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT 

501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT 

551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK 

601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE 

651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE

701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA 

751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG 

801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF 

851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL 

901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG

951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR 

1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE 
1051 GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ
1101 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ
1151 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT
1201 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD
1251 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL
1301 AAKQLEHHHH HH*
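
In the AG983-741 amino-acid listing, a short glycine-rich stretch (GSGGGG) sits between the two fusion partners. As an illustrative check, the Python sketch below joins rows 1001 and 1051 of that listing and reports the 1-based position at which the stretch starts. Treating GSGGGG as the junction marker is an assumption made for this example, and the function name `find_motif` is hypothetical.

```python
def find_motif(rows: dict[int, str], motif: str) -> list[int]:
    """Return the 1-based listing positions at which `motif` starts.

    `rows` maps each row's starting position to its residues (spaces allowed);
    the rows are assumed to be contiguous.
    """
    starts = sorted(rows)
    offset = starts[0]
    sequence = "".join(rows[s].replace(" ", "") for s in starts)
    hits = []
    idx = sequence.find(motif)
    while idx != -1:
        hits.append(offset + idx)
        idx = sequence.find(motif, idx + 1)
    return hits


# Rows 1001 and 1051 of the AG983-741 amino-acid listing.
rows = {
    1001: "NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE",
    1051: "GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ",
}
print(find_motif(rows, "GSGGGG"))  # [1051]
```

On these two rows the stretch begins at residue 1051, so residues 1 to 1050 precede it and the 741 portion follows it.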



AG983-961 



1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG
1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG
1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC
1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG
1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA
1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC
2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC
2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA
2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA
2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG
2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC
2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG
2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC
2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT
2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA
2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA
2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT
2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT
2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG
2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG
2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG
2751 GGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA
2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA
2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT
2901
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 
3951 
4001 
4051 
4101 
4151 
4201 
4251 
4301 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 



TGCAACGGCG 
CGGGCGGCTT 
AATATGCCGC 
CGGCAACGGC 
AGTACGGCAA 
GGTGGCGGAG 
TGCCACTGTG 
GTTTCAAAGC 
ACCAAAAAAG 
TCTGGGTCTG 
ACAAACAAAA 
AAGTTAACAA 
TGCCGCTCTG 
TAACGACATT 
AAATTAGAAG 
CGATATCGCC 
TCAAAACCGC 
GTCGATGCCA 
TGCCGCTGGC 
CAAAAGTTAC 
GCTAAAAAAG 
CAAATTTGTC 
CACGCTTGGC 
AACGGTTTGG 
CCTTGCAGAA 
GTCGGTTCAA 
GTCGCCATCG 
AGGCGTGGCA 
GCGTCAATTA 

MTSAPDFNAG 
VAVTDRDAKI 
GIVDTGESVG 
ASFDDEAVIB 
ATLHIMNTND 
LFQIANSEEQ 
IFSTGNDAQA 
GTEPLEYGSN 
TAALLLQKYP 
PASFPFGDFT 
GKTIIEGGSL 
DQSGANETVH 
GAGYIiNSTGR 
GDTLSYYVRR 
SSATPETVET 
ATVYADSTAA 
VEGKMRGSTQ 
AGIRHDAGDI 
GGVNVPFAAT 
LAGIiKLSQPL 
NMPHTRLVAG 
GGGGTGSATN 
TKKDATAADV 
KLTTKLADTD 
KLEAVADTVD 
VDAKVKAAET 
AKKANSADVY 
NGLDKTVSDL 
VAIGTGFRFT 



GGCGTGGAAC 
TACCGGCGCG 
ACACCCGTCT 
TGGAACGGCT 
CCACAGCGGA 
GCACTGGATC 
GCCATTGCTG 
TGGAGAGACC 
ACGCAACTGC 
AAAAAAGTCG 
CGTCGATGCC 
CCAAGTTAGC 
GATGCAACCA 
TGCTGAAGAG 
CCGTGGCTGA 
GATTCATTGG 
CAATGAAGCC 
AAGTAAAAGC 
ACAGCTAATA 
CGACATCAAA 
CAAACAGTGC 
AGAATTGATG 
TTCTGCTGAA 
ATAAAACAGT 
CAAGCCGCGC 
TGTAACGGCT 
GTACCGGCTT 
GTCGGCACTT 
CGAGTGGCTC 

GTGIGSNSRA 
NAPPPNLHTG 
SISFPELYGR 
TEAKPTDIRH 
ETKNEMMVAA 
YRQALLDYSG 
QPNTYALIiPF 
HCGITAMWCL 
WMSNDNLRTT 
ADTKGTSDIA 
VLYGNNKSDM 
IKGSLQLDGK 
RVPFLSAAKI 
GNAARTASAA 
AAADRTDMPG 
HADMQGRRLK 
TVGIAAKTGE 
GYLKGLFSYG 
GDLTVEGGLR 
SDKAVLFATA 
LGADVKFGNG 
DDDVKKAATV 
EADDFKGLGL 
AALADTDAAL 
KHAEAFNDIA 
AAGKAEAAAG 
TREESDSKFV 
RKETRQGLAE 
ENFAAKAGVA 



GCGACCTGAA 
ACTGCAGCAA 
GGTTGCCGGC 
TGGCACGTTA 
CGAGTCGGCG 
CGCCACAAAC 
CTGCCTACAA 
ATCTACGACA 
AGCCGATGTT 
TGACTAACCT 
AAAGTAAAAG 
AGACACTGAT 
CCAACGCCTT 
ACTAAGACAA 
TACCGTCGAC 
ATGAAACCAA 
AAACAGACGG 
TGCAGAAACT 
CTGCAGCCGA 
GCTGATATCG 
CGACGTGTAC 
GTCTGAACGC 
AAATCCATTG 
GTCAGACCTG 
TCTCCGGTCT 
GCAGTCGGCG 
CCGCTTTACC 
CGTCCGGTTC 
GAGCACCACC 

TTAKSAAVSY 
DFPNPNDAYK 
KEHGYNENYK 
VKEIGHIDLV 
IRNAWVKLGE 
GDKTDEGIRL 
YEKDAQKGII 
SAPYEASVRF 
LLTTAQDIGA 
YSFRNDISGT 
RVETKGALIY 
GTLYTRLGKL 
GQDYSFFTNI 
AHSAPAGLKH 
IRPYGATFRA 
AVSDGLDHNG 
NTTAAATLGM 
RYKNSISRST 
YDLLKQDAFA 
GVERDLNGRD 
WNGLARYSYA 
AIAAAYNNGQ 
KKWTNLTKT 
DATTNALNKL 
DSLDETNTKA 
TANTAADKAE 
RIDGLNATTE 
QAALSGLFQP 
VGTSSGSSAA 



CGGACGCGAC 
CCGGCAAGAC 
CTGGGCGCGG 
CAGCTACGCC 
TAGGCTACCG 
GACGACGATG 
CAATGGCCAA 
TTGATGAAGA 
GAAGCCGACG 
GACCAAAACC 
CTGCAGAATC 
GCCGCTTTAG 
GAATAAATTG 
ATATCGTAAA 
AAGCATGCCG 
CACTAAGGCA 
CCGAAGAAAC 
GCAGCAGGCA 
CAAGGCCGAA 
CTACGAACAA 
ACCAGAGAAG 
TACTACCGAA 
CCGATCACGA 
CGCAAAGAAA 
GTTCCAACCT 
GCTACAAATC 
GAAAACTTTG 
TTCCGCAGCC 
ACCACCACCA 

AGIKNEMCKD 
NLINLKPAIE 
NYTAYMRKEA 
SHIIGGRSVD 
RGVRIVNNSF 
MQQSDYGNLS 
TVAGVDRSGE 
TRTNPIQIAG 
VGVDSKFGWG 
GGLIKKGGSQ 
NGAASGGSLN 
LKVDGTAIIG 
ETDGGLLASL 
AVEQGGSNLE 
AAAVQHANAA 
TGLRVIAQTQ 
GRSTWSENSA 
GADEHAEGSV 
EKGSALGWSG 
YTVTGGFTGA 
GSKQYGNHSG 
EINGFKAGET 
VNENKQNVDA 
GENITTFAEE 
DEAVKTANEA 
AVAAKVTDIK 
KLDTRLASAE 
YNVGRFNVTA 
YHVGVNYEWL 



TACACGGTAA 
GGGGGCACGC 
ATGTCGAATT 
GGTTCCAAAC 
GTTCCTCGAG 
TTAAAAAAGC 
GAAATCAACG 
CGGCACAATT 
ACTTTAAAGG 
GTCAATGAAA 
TGAAATAGAA 
CAGATACTGA 
GGAGAAAATA 
AATTGATGAA 
AAGCATTCAA 
GACGAAGCCG 
CAAACAAAAC 
AAGCCGAAGC 
GCTGTCGCTG 
AGATAATATT 
AGTCTGACAG 
AAATTGGACA 
TACTCGCCTG 
CCCGCCAAGG 
TACAACGTGG 
CGAATCGGCA 
CCGCCAAAGC 
TACCATGTCG 
CTGA 

RSMLCAGRDD 
AGYTGRGVEV 
PEDGGGKDIE 
GRPAGGIAPD 
GTTSRAGTAD 
YHIKNKNMLF 
KFKREMYGEP 
TSFSAPIVTG 
LLDAGKAMNG 
LQLHGNNTYT 
SDGIVYLADT 
GKLYMSARGK 
DSVEKTAGSE 
NLMVELDASE 
DGVRIFNSLA 
QDGGTWEQGG 
NAKTDSISLF 
NGTLMQLGAL 
NSLTEGTLVG 
TAATGKTGAR 
RVGVGYRFUS 
IYDIDEDGTI 
KVKAAESEIE 
TKTNIVKIDE 
KQTAEETKQN 
ADIATNKDNI 
KSIADHDTRb 
AVGGYKSESA 
EHHHHHH* 
ΔG983-961c

1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA 
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA 
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC 
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT 
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA 
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251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA 

301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT 

351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG 

401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA 

451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA

501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA 

551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT

601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT 

651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC 

701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC

751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA 

801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA 

851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC 

901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT 

951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG

1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG 

1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT 

1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA 

1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC 

1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT

1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG 

1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA 

1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC 

1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA 

1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG

1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA 

1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG 

1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC 

1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT 

1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG

1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG 

1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC 

1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG 

1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA 

1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC

2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC 

2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA 

2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA 

2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG 

2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC

2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG 

2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC 

2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT 

2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA 

2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA

2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT 

2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT 

2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG 

2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG 

2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG

2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA 

2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA 

2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT 

2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA 

2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC

3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT 

3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC 

3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG 

3151 GGTGGCGGAG GCACTGGATC CGCCACAAAC GACGACGATG TTAAAAAAGC 

3201 TGCCACTGTG GCCATTGCTG CTGCCTACAA CAATGGCCAA GAAATCAACG

3251 GTTTCAAAGC TGGAGAGACC ATCTACGACA TTGATGAAGA CGGCACAATT 

3301 ACCAAAAAAG ACGCAACTGC AGCCGATGTT GAAGCCGACG ACTTTAAAGG 

3351 TCTGGGTCTG AAAAAAGTCG TGACTAACCT GACCAAAACC GTCAATGAAA 

3401 ACAAACAAAA CGTCGATGCC AAAGTAAAAG CTGCAGAATC TGAAATAGAA 

3451 AAGTTAACAA CCAAGTTAGC AGACACTGAT GCCGCTTTAG CAGATACTGA

3501 TGCCGCTCTG GATGCAACCA CCAACGCCTT GAATAAATTG GGAGAAAATA 

3551 TAACGACATT TGCTGAAGAG ACTAAGACAA ATATCGTAAA AATTGATGAA 
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3601 AAATTAGAAG CCGTGGCTGA TACCGTCGAC AAGCATGCCG AAGCATTCAA 

3651 CGATATCGCC GATTCATTGG ATGAAACCAA CACTAAGGCA GACGAAGCCG 

3701 TCAAAACCGC CAATGAAGCC AAACAGACGG CCGAAGAAAC CAAACAAAAC 

3751 GTCGATGCCA AAGTAAAAGC TGCAGAAACT GCAGCAGGCA AAGCCGAAGC 

3801 TGCCGCTGGC ACAGCTAATA CTGCAGCCGA CAAGGCCGAA GCTGTCGCTG 

3851 CAAAAGTTAC CGACATCAAA GCTGATATCG CTACGAACAA AGATAATATT 

3901 GCTAAAAAAG CAAACAGTGC CGACGTGTAC ACCAGAGAAG AGTCTGACAG 

3951 CAAATTTGTC AGAATTGATG GTCTGAACGC TACTACCGAA AAATTGGACA 

4001 CACGCTTGGC TTCTGCTGAA AAATCCATTG CCGATCACGA TACTCGCCTG 

4051 AACGGTTTGG ATAAAACAGT GTCAGACCTG CGCAAAGAAA CCCGCCAAGG 

4101 CCTTGCAGAA CAAGCCGCGC TCTCCGGTCT GTTCCAACCT TACAACGTGG 

4151 GTCTCGAGCA CCACCACCAC CACCACTGA 

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD 

51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV 

101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE

151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD 

201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD 

251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF 

301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP

351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG 

401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG 

451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT 

501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT

551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK 

601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE 

651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE

701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA 

751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG 

801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF 

851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL 

901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG

951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR 

1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE

1051 GGGGTGSATN DDDVKKAATV AIAAAYNNGQ EINGFKAGET IYDIDEDGTI 

1101 TKKDATAADV EADDFKGLGL KKVVTNLTKT VNENKQNVDA KVKAAESEIE

1151 KLTTKLADTD AALADTDAAL DATTNALNKL GENITTFAEE TKTNIVKIDE 

1201 KLEAVADTVD KHAEAFNDIA DSLDETNTKA DEAVKTANEA KQTAEETKQN 

1251 VDAKVKAAET AAGKAEAAAG TANTAADKAE AVAAKVTDIK ADIATNKDNI 

1301 AKKANSADVY TREESDSKFV RIDGLNATTE KLDTRLASAE KSIADHDTRL 

1351 NGLDKTVSDL RKETRQGLAE QAALSGLFQP YNVGLEHHHH HH* 
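
The nucleotide and amino acid listings of a construct can be cross-checked by translating the coding sequence with the standard genetic code. The following Python sketch is illustrative only and forms no part of the original disclosure. The function name `translate` is an assumption made here, and the two test fragments are taken from the ΔG983-961c listing above (the first 30 nucleotides, and the final ten codons spanning rows 4101 and 4151).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; spaces and line breaks are ignored."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of the listing (row 1).
head = "ATGACTTCTG CGCCCGACTT CAATGCAGGC"
# Last ten codons: final G of row 4101 followed by row 4151.
tail = "G GTCTCGAGCA CCACCACCAC CACCACTGA"

print(translate(head))  # N-terminal residues
print(translate(tail))  # linker, hexa-histidine tag and stop
```

The translated tail shows the XhoI-encoded Leu-Glu dipeptide followed by six histidines and a stop codon, in agreement with the end of the amino acid listing.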

ΔG741 and hybrids

Bactericidal titres generated in response to ΔG741 (His-fusion) were measured against
various strains, including the homologous 2996 strain:





          2996    MC58      NGH38    F6124    BZ133
ΔG741     512     131072    >2048    16384    >2048



As can be seen, the ΔG741-induced bactericidal titre is particularly high against
heterologous strain MC58.
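
To put the titres above on a common scale, each value can be expressed as a fold-change relative to the homologous strain 2996. The following Python sketch is illustrative only; the helper names `parse_titre` and `fold_vs_reference` are assumptions made here, and values reported as '>' are treated as lower bounds.

```python
import math

# Titres as reported in the table above; '>' marks a lower bound.
TITRES = {"2996": "512", "MC58": "131072", "NGH38": ">2048",
          "F6124": "16384", "BZ133": ">2048"}


def parse_titre(text):
    """Return (value, qualifier); qualifier is '<', '>' or '' for exact."""
    text = text.strip()
    qualifier = text[0] if text[0] in "<>" else ""
    return int(text.lstrip("<>")), qualifier


def fold_vs_reference(titres, reference):
    """Fold-change and log2 fold-change of each titre versus the reference."""
    ref_value, _ = parse_titre(titres[reference])
    result = {}
    for strain, raw in titres.items():
        value, qualifier = parse_titre(raw)
        fold = value / ref_value
        result[strain] = (qualifier, fold, math.log2(fold))
    return result


for strain, (qualifier, fold, log2_fold) in fold_vs_reference(TITRES, "2996").items():
    print(f"{strain}: {qualifier}{fold:g}-fold (log2 = {log2_fold:g})")
```

On these figures the MC58 titre is 256-fold (eight doubling dilutions) above the titre against the homologous strain.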

ΔG741 was also fused directly in-frame upstream of proteins 961, 961c, 983 and ORF46.1:

ΔG741-961

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC 
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG 
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT 
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT 
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA 
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251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 



CCTTGGAGAG 
ACCGCCTTTC 
GGTTGCGAAA 
CTTTTGACAA 
TTCGGTTCAG 
CGCCAAGCAG 
ATGTCGACCT 
GTCATCAGCG 
CCTCGGTATC 
TGAAAACCGT 
GAGGGTGGCG 
AGCTGCCACT 
ACGGTTTCAA 
ATTACCAAAA 
AGGTCTGGGT 
AAAACAAACA 
GAAAAGTTAA 
TGATGCCGCT 
ATATAACGAC 
GAAAAATTAG 
CAACGATATC 
CCGTCAAAAC 
AACGTCGATG 
AGCTGCCGCT 
CTGCAAAAGT 
ATTGCTAAAA 
CAGCAAATTT 
ACACACGCTT 
CTGAACGGTT 
AGGCCTTGCA 
TGGGTCGGTT 
GCAGTCGCCA 
AGCAGGCGTG 
TCGGCGTCAA 

MVAADIGAGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VISGSVLYNQ 
EGGGGTGSAT 
ITKKDATAAD 
EKLTTKLADT 
EKLEAVADTV 
NVDAKVKAAE 
IAKKANSADV 
LNGLDKTVSD 
AVAIGTGFRF 



ΔG741-961c

1 ATGGTCGCCG 

51 GCTCGACCAT 

101 TCAGGAAAAA 

151 TATGGAAACG 

201 CAGCCGTTTC 

251 CCTTGGAGAG 

301 ACCGCCTTTC 

351 GGTTGCGAAA 

401 CTTTTGACAA 

451 TTCGGTTCAG 

501 CGCCAAGCAG 

551 ATGTCGACCT 

601 GTCATCAGCG 

651 CCTCGGTATC 

701 TGAAAACCGT 

751 GAGGGTGGCG 



TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GAGGCACTGG 
GTGGCCATTG 
AGCTGGAGAG 
AAGACGCAAC 
CTGAAAAAAG 
AAACGTCGAT 
CAACCAAGTT 
CTGGATGCAA 
ATTTGCTGAA 
AAGCCGTGGC 
GCCGATTCAT 
CGCCAATGAA 
CCAAAGTAAA 
GGCACAGCTA 
TACCGACATC 
AAGCAAACAG 
GTCAGAATTG 
GGCTTCTGCT 
TGGATAAAAC 
GAACAAGCCG 
CAATGTAACG 
TCGGTACCGG 
GCAGTCGGCA 
TTACGAGTGG 

ADALTAPLDH 
KLKNDKVSRF 
SEHSGKMVAK 
TYTIDFAAKQ 
AEKGSYSLGI 
NDDDVKKAAT 
VEADDFKGLG 
DAALADTDAA 
DKHAEAFNDI 
TAAGKAEAAA 
YTREESDSKF 
LRKBTRQGIA 
TENFAAKAGV 



CCGACATCGG 
AAAGACAAAG 
CGAGAAACTG 
GTGACAGCCT 
GACTTTATCC 
TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GAGGCACTGG 



CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAATCGAACA 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
ATCCGCCACA 
CTGCTGCCTA 
ACCATCTACG 
TGCAGCCGAT 
TCGTGACTAA 
GCCAAAGTAA 
AGCAGACACT 
CCACCAACGC 
GAGACTAAGA 
TGATACCGTC 
TGGATGAAAC 
GCCAAACAGA 
AGCTGCAGAA 
ATACTGCAGC 
AAAGCTGATA 
TGCCGACGTG 
ATGGTCTGAA 
GAAAAATCCA 
AGTGTCAGAC 
CGCTCTCCGG 
GCTGCAGTCG 
CTTCCGCTTT 
CTTCGTCCGG 
CTCGAGCACC 

KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKIEHLKS 
FGGKAQEVAG 
VAIAAAYNNG 
LKKWTNLTK 
LDATTNALNK 
ADSLDETNTK 
GTANTAADKA 
VRIDGLNATT 
EQAALSGLFQ 
AVGTSSGSSA 



TGCGGGGCTT 
GTTTGCAGTC 
AAGCTGGCGG 
CAATACGGGC 
GCCAAATCGA 
CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAATCGAACA 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
ATCCGCCACA 



AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GCCGAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
AACGACGACG 
CAACAATGGC 
ACATTGATGA 
GTTGAAGCCG 
CCTGACCAAA 
AAGCTGCAGA 
GATGCCGCTT 
CTTGAATAAA 
CAAATATCGT 
GACAAGCATG 
CAACACTAAG 
CGGCCGAAGA 
ACTGCAGCAG 
CGACAAGGCC 
TCGCTACGAA 
TACACCAGAG 
CGCTACTACC 
TTGCCGATCA 
CTGCGCAAAG 
TCTGTTCCAA 
GCGGCTACAA 
ACCGAAAACT 
TTCTTCCGCA 
ACCACCACCA 

DQSVRKNEKL 
QLITLESGEF 
EHTSFDKLPE 
PELNVDLAAA 
SAEVKTVNGI 
QEINGFKAGE 
TVNENKQNVD 
LGENITTFAE 
ADEAVKTANE 
EAVAAKVTDI 
EKLDTRLASA 
PYNVGRFNVT 
AYHVGVNYEW 



GCCGATGCAC 
TTTGACGCTG 
CACAAGGTGC 
AAATTGAAGA 
AGTGGACGGG 
AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GCCGAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
AACGACGACG 



TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
CAAGCAACTC 
ATGTTAAAAA 
CAAGAAATCA 
AGACGGCACA 
ACGACTTTAA 
ACCGTCAATG 
ATCTGAAATA 
TAGCAGATAC 
TTGGGAGAAA 
AAAAATTGAT 
CCGAAGCATT 
GCAGACGAAG 
AACCAAACAA 
GCAAAGCCGA 
GAAGCTGTCG 
CAAAGATAAT 
AAGAGTCTGA 
GAAAAATTGG 
CGATACTCGC 
AAACCCGCCA 
CCTTACAACG 
ATCCGAATCG 
TTGCCGCCAA 
GCCTACCATG 
CCACTGA 

KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQL 
TIYDIDEDGT 
AKVKAAESEI 
ETKTNXVKID 
AKQTAEETKQ 
KADIATNKDN 
EKSIADHDTR 
AAVGGYKSES 
LEHHHHHH* 



TAACCGCACC 
GATCAGTCCG 
GGAAAAAACT 
ACGACAAGGT 
CAGCTCATTA 
TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
CAAGCAACTC 
ATGTTAAAAA 
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801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 



AGCTGCCACT 
ACGGTTTCAA 
ATTACCAAAA 
AGGTCTGGGT 
AAAACAAACA 
GAAAAGTTAA 
TGATGCCGCT 
ATATAACGAC 
GAAAAATTAG 
CAACGATATC 
CCGTCAAAAC 
AACGTCGATG 
AGCTGCCGCT 
CTGCAAAAGT 
ATTGCTAAAA 
CAGCAAATTT 
ACACACGCTT 
CTGAACGGTT 
AGGCCTTGCA 
TGGGTCTCGA 



GTGGCCATTG 
AGCTGGAGAG 
AAGACGCAAC 
CTGAAAAAAG 
AAACGTCGAT 
CAACCAAGTT 
CTGGATGCAA 
ATTTGCTGAA 
AAGCCGTGGC 
GCCGATTCAT 
CGCCAATGAA 
CCAAAGTAAA 
GGCACAGCTA 
TACCGACATC 
AAGCAAACAG 
GTCAGAATTG 
GGCTTCTGCT 
TGGATAAAAC 
GAACAAGCCG 
GCACCACCAC 



CTGCTGCCTA 
ACCATCTACG 
TGCAGCCGAT 
TCGTGACTAA 
GCCAAAGTAA 
AGCAGACACT 
CCACCAACGC 
GAGACTAAGA 
TGATACCGTC 
TGGATGAAAC 
GCCAAACAGA 
AGCTGCAGAA 
ATACTGCAGC 
AAAGCTGATA 
TGCCGACGTG 
ATGGTCTGAA 
GAAAAATCCA 
AGTGTCAGAC 
CGCTCTCCGG 
CACCACCACT 



CAACAATGGC 
ACATTGATGA 
GTTGAAGCCG 
CCTGACCAAA 
AAGCTGCAGA 
GATGCCGCTT 
CTTGAATAAA 
CAAATATCGT 
GACAAGCATG 
CAACACTAAG 
CGGCCGAAGA 
ACTGCAGCAG 
CGACAAGGCC 
TCGCTACGAA 
TACACCAGAG 
CGCTACTACC 
TTGCCGATCA 
CTGCGCAAAG 
TCTGTTCCAA 
GA 



CAAGAAATCA 
AGACGGCACA 
ACGACTTTAA 
ACCGTCAATG 
ATCTGAAATA 
TAGCAGATAC 
TTGGGAGAAA 
AAAAATTGAT 
CCGAAGCATT 
GCAGACGAAG 
AACCAAACAA 
GCAAAGCCGA 
GAAGCTGTCG 
CAAAGATAAT 
AAGAGTCTGA 
GAAAAATTGG 
CGATACTCGC 
AAACCCGCCA 
CCTTACAACG 
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i 

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 



MVAADIGAGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VTSGSVLYNQ 
EGGGGTGSAT 
ITKKDATAAD 
EKLTTKLADT 
EKLEAVADTV 
NVDAKVKAAE 
IAKKANSADV 
LNGLDKTVSD 



ΔG741-983



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 



ATGGTCGCCG 
GCTCGACCAT 
TCAGGAAAAA 
TATGGAAACG 
CAGCCGTTTC 
CCTTGGAGAG 
ACCGCCTTTC 
GGTTGCGAAA 
CTTTTGACAA 
TTCGGTTCAG 
CGCCAAGCAG 
ATGTCGACCT 
GTCATCAGCG 
CCTCGGTATC 
TGAAAACCGT 
GAGGGATCCG 
TACCGGTATC 
TATCTTACGC 
TGTGCCGGTC 
TGCCCCCCCC 
CATACAAGAA 
GGACGCGGGG 
CATATCCTTT 
ATTACAAAAA 
GGCGGTAAAG 
TGAAGCAAAG 
ATTTGGTCTC 
GGCGGTATTG 
AACCAAGAAC 
TGGGCGAACG 
AGGGCAGGCA 



ADALTAPLDH 
KLKNDKVSRF 
SEHSGKMVAK 
TYTIDFAAKQ 
AEKGSYSLGI 
NDDDVKKAAT 
VEADDFKGLG 
DAALAOTDAA 
DKHAEAFNDI 
TAAGKAEAAA 
YTREESDSKF 
LRKETRQGLA 



CCGACATCGG 
AAAGACAAAG 
CGAGAAACTG 
GTGACAGCCT 
GACTTTATCC 
TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GCGGAGGCGG 
GGCAGCAACA 
CGGTATCAAG 
GGGATGACGT 
CCGAATCTGC 
TTTGATCAAC 
TAGAGGTAGG 
CCCGAACTGT 
CTATACGGCG 
ACATTGAAGC 
CCGACGGATA 
CCATATTATT 
CGCCCGATGC 
GAAATGATGG 
TGGCGTGCGC 
CTGCCGACCT 



KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKIEHLKS 
FGGKAQEVAG 
VAIAAAYNNG 
LKKWTNLTK 
LDATTNALNK 
ADSLDETNTK 
GTANTAADKA 
VRIDGLNATT 
EQAALSGLFQ 



TGCGGGGCTT 
GTTTGCAGTC 
AAGCTGGCGG 
CAATACGGGC 
GCCAAATCGA 
CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAATCGAACA 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
CACTTCTGCG 
GCAGAGCAAC 
AACGAAATGT 
TGCGGTTACA 
ATACCGGAGA 
CTCAAACCTG 
TATCGTCGAC 
ATGGCAGAAA 
TATATGCGGA 
TTCTTTCGAC 
TCCGCCACGT 
GGCGGGCGTT 
GACGCTACAC 
TTGCAGCCAT 
ATCGTCAATA 
TTTCCAAATA 



DQSVRKNEKL 
QLITLESGEF 
EHTSFDKLPE 
PELNVDLAAA 
SAEVKTVNGI 
QEINGFKAGE 
TVNENKQNVD 
LGENITTFAE 
ADEAVKTANE 
EAVAAKVTDI 
EKLDTRLASA 
PYNVGLEHHH 



GCCGATGCAC 
TTTGACGCTG 
CACAAGGTGC 
AAATTGAAGA 
AGTGGACGGG 
AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GCCGAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
CCCGACTTCA 
AACAGCGAAA 
GCAAAGACAG 
GACAGGGATG 
CTTTCCAAAC 
CAATTGAAGC 
ACAGGCGAAT 
AGAACACGGC 
AGGAAGCGCC 
GATGAGGCCG 
AAAAGAAATC 
CCGTGGACGG 
ATAATGAATA 
CCGCAATGCA 
ACAGTTTTGG 
GCCAATTCGG 



KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQIi 
TIYDIDEDGT 
AKVKAAESEI 
ETKTNIVKID 
AKQTAEETKQ 
KADIATNKDN 
EKSIADHDTR 
HHH* 



TAACCGCACC 
GATCAGTCCG 
GGAAAAAACT 
ACGACAAGGT 
CAGCTCATTA 
TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
CAAGCAACTC 
ATGCAGGCGG 
TCAGCAGCAG 
AAGCATGCTC 
CCAAAATCAA 
CCAAATGACG 
AGGCTATACA 
CCGTCGGCAG 
TATAACGAAA 
TGAAGACGGA 
TTATAGAGAC 
GGACACATCG 
CAGACCTGCA 
CGAATGATGA 
TGGGTCAAGC 
AACAACATCG 
AGGAGCAGTA 
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1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 
2901 
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 



CCGCCAAGCG 
TCCGCCTGAT 
AATAAAAACA 
GCCCAACACA 
GCATTATCAC 
GAAATGTATG 
TTGCGGAATT 
TCCGTTTCAC 
GCACCCATCG 
GATGAGCAAC 
TCGGTGCAGT 
GGTAAGGCCA 
CGATACGAAA 
CAGGCACGGG 
GGCAACAACA 
GTTGTACGGC 
TGATTTATAA 
GTCTATCTGG 
CAAAGGCAGT 
GCAAACTGCT 
ATGTCGGCAC 
TGTTCCCTTC 
CAAACATCGA 
AAAACAGCGG 
CAATGCGGCA 
TGAAACACGC 
GAACTGGATG 
GGCAGCCGAC 
TCCGCGCAGC 
ATCTTCAACA 
TGCCGATATG 
ACAACGGCAC 
ACGTGGGAAC 
CGTCGGCATT 
TGGGCATGGG 
GACAGCATTA 
CTATCTCAAA 
GCAGCACCGG 
ATGCAGCTGG 
AGATTTGACG 
CATTCGCCGA 
GAAGGCACGC 
CGATAAAGCC 
GACGCGACTA 
GGCAAGACGG 
GGGCGCGGAT 
GCTACGCCGG 
GGCTACCGGT 

MVAADIGAGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VISGSVLYNQ 
EGSGGGGTSA 
CAGRDDVAVT 
GRGVEVGIVD 
GGKDIBASFD 
GGIAPDATLH 
RAGTADLFQI 
NKNMLFIFST 
EMYGEPGTEP 
APIVTGTAAL 
GKAMNGPASF 
GNNTYTGKTI 
VYLADTDQSG 
MSARGKGAGY 



TTGCTCGACT 
GCAACAGAGC 
TGCTTTTCAT 
TATGCCCTAT 
AGTCGCAGGC 
GAGAACCGGG 
ACTGCCATGT 
CCGTACAAAC 
TAACCGGCAC 
GACAACCTGC 
CGGCGTGGAC 
TGAACGGACC 
GGTACATCCG 
CGGCCTGATC 
CCTATACGGG 
AACAACAAAT 
CGGGGCGGCA 
CAGATACCGA 
CTGCAGCTGG 
GAAAGTGGAC 
GCGGCAAGGG 
CTGAGTGCCG 
AACCGACGGC 
GCAGTGAAGG 
CGGACTGCTT 
CGTAGAACAG 
CCTCCGAATC 
CGCACAGATA 
GGCAGCCGTA 
GTCTCGCCGC 
CAGGGACGCC 
GGGTCTGCGC 
AGGGCGGTGT 
GCCGCGAAAA 
ACGCAGCACA 
GTCTGTTTGC 
GGCCTGTTCT 
TGCGGACGAA 
GCGCACTGGG 
GTCGAAGGCG 
AAAAGGCAGT 
TGGTCGGACT 
GTCCTGTTTG 
CACGGTAACG 
GGGCACGCAA 
GTCGAATTCG 
TTCCAAACAG 
TCCTCGAGCA 

ADALTAPLDH 
KLKNDKVSRF 
SEHSGKMVAK 
TYTIDFAAKQ 
AEKGSYSLGI 
PDFNAGGTGI 
DRDAKINAPP 
TGESVGSISF 
DEAVIETEAK 
IMNTNDETKN 
ANSEEQYRQA 
GNDAQAQPNT 
LEYGSNHCGI 
LLQKYPWMSN 
PFGDFTADTK 
IEGGSLVLYG 
ANETVHIKGS 
LNSTGRRVPF 



ATTCCGGCGG 
GATTACGGCA 
CTTTTCGACA 
TGCCATTTTA 
GTAGACCGCA 
TACAGAACCG 
GGTGCCTGTC 
CCGATTCAAA 
GGCGGCTCTG 
GTACCACGTT 
AGCAAGTTCG 
CGCGTCCTTT 
ATATTGCCTA 
AAAAAAGGCG 
CAAAACCATT 
CGGATATGCG 
TCCGGCGGCA 
CCAATCCGGC 
ACGGCAAAGG 
GGTACGGCGA 
GGCAGGCTAT 
CCAAAATCGG 
GGCCTGCTGG 
CGACACGCTG 
CGGCAGCGGC 
GGCGGCAGCA 
ATCCGCAACA 
TGCCGGGCAT 
CAGCATGCGA 
TACCGTCTAT 
GCCTGAAAGC 
GTCATCGCGC 
TGAAGGCAAA 
CCGGCGAAAA 
TGGAGCGAAA 
AGGCATACGG 
CCTACGGACG 
CATGCGGAAG 
CGGTGTCAAC 
GTCTGCGCTA 
GCTTTGGGCT 
CGCGGGTCTG 
CAACGGCGGG 
GGCGGCTTTA 
TATGCCGCAC 
GCAACGGCTG 
TACGGCAACC 
CCACCACCAC 

KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKIEHLKS 
FGGKAQEVAG 
GSNSRATTAK 
PNLHTGDFPN 
PELYGRKEHG 
PTDIRHVKEI 
EMMVAAIRNA 
LLDYSGGDKT 
YALLPFYEKD 
TAMWCLSAPY 
DNLRTTLLTT 
GTSDIAYSFR 
NNKSDMRVET 
LQLDGKGTLY 
LSAAKIGQDY 



TGATAAAACA 
ACCTGTCCTA 
GGCAATGACG 
TGAAAAAGAC 
GTGGAGAAAA 
CTTGAGTATG 
GGCACCCTAT 
TTGCCGGAAC 
CTGCTGCAGA 
GCTGACGACG 
GCTGGGGACT 
CCGTTCGGCG 
CTCCTTCCGT 
GCAGCCAACT 
ATCGAAGGCG 
CGTCGAAACC 
GCCTGAACAG 
GCAAACGAAA 
TACGCTGTAC 
TTATCGGCGG 
CTCAACAGTA 
GCAGGATTAT 
CTTCCCTCGA 
TCCTATTATG 
ACATTCCGCG 
ATCTGGAAAA 
CCCGAGACGG 
CCGCCCCTAC 
ATGCCGCCGA 
GCCGACAGTA 
CGTATCGGAC 
AAACCCAACA 
ATGCGCGGCA 
TACGACAGCA 
ACAGTGCAAA 
CACGATGCGG 
CTACAAAAAC 
GCAGCGTCAA 
GTTCCGTTTG 
CGACCTGCTC 
GGAGCGGCAA 
AAGCTGTCGC 
CGTGGAACGC 
CCGGCGCGAC 
ACCCGTCTGG 
GAACGGCTTG 
ACAGCGGACG 
CACCACTGA 

DQSVRKNEKL 
QLITLESGEF 
EHTSFDKLPE 
PELNVDLAAA 
SAEVKTVNGI 
SAAVSYAGIK 
PNDAYKNLIN 
YNENYKNYTA 
GHIDLVSHII 
WVKLGERGVR 
DEGIRIiMQQS 
AQKGIITVAG 
EASVRFTRTN 
AQDIGAVGVD 
NDISGTGGLI 
KGALIYNGAA 
TRLGKLLKVD 
SFFTNIETDG 



GACGAGGGTA 
CCACATCCGT 
CACAAGCTCA 
GCTCAAAAAG 
GTTCAAACGG 
GCTCCAACCA 
GAAGCAAGCG 
ATCCTTTTCC 
AATACCCGTG 
GCTCAGGACA 
GCTGGATGCG 
ACTTTACCGC 
AACGACATTT 
GCAACTGCAC 
GTTCGCTGGT 
AAAGGTGCGC 
CGACGGCATT 
CCGTACACAT 
ACACGTTTGG 
CAAGCTGTAC 
CCGGACGACG 
TCTTTCTTCA 
CAGCGTCGAA 
TCCGTCGCGG 
CCCGCCGGTC 
CCTGATGGTC 
TTGAAACTGC 
GGCGCAACTT 
CGGTGTACGC 
CCGCCGCCCA 
GGGTTGGACC 
GGACGGTGGA 
GTACCCAAAC 
GCCGCCACAC 
TGCAAAAACC 
GCGATATCGG 
AGCATCAGCC 
CGGCACGCTG 
CCGCAACGGG 
AAACAGGATG 
CAGCCTCACT 
AACCCTTGAG 
GACCTGAACG 
TGCAGCAACC 
TTGCCGGCCT 
GCACGTTACA 
AGTCGGCGTA 



KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQL 
NEMCKDRSML 
LKPAIEAGYT 
YMRKEAPEDG 
GGRSVDGRPA 
XVNNSFGTTS 
DYGNLSYHIR 
VDRSGEKFKR 
PIQIAGTSFS 
SKFGWGLIiDA 
KKGGSQLQLH 
SGGSLNSDGI 
GTAIIGGKLY 
GLLASLDSVE 
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KTAGSEGDTL 
ELDASESSAT 
IFNSLAATVY 
TWEQGGVEGK 
DSISLFAGIR 
MQLGALGGVN 
EGTLVGLAGL 
GKTGARNMPH 
GYRPIiEHHHH 



ΔG741-ORF46.1



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 



ATGGTCGCCG 
GCTCGACCAT 
TCAGGAAAAA 
TATGGAAACG 
CAGCCGTTTC 
CCTTGGAGAG 
ACCGCCTTTC 
GGTTGCGAAA 
CTTTTGACAA 
TTCGGTTCAG 
CGCCAAGCAG 
ATGTCGACCT 
GTCATCAGCG 
CCTCGGTATC 
TGAAAACCGT 
GACGGTGGCG 
CCGGCAGGTT 
TATTCGGCAG 
GGAAAAATAC 
CATTAAAGGA 
AAGTCCATTC 
GCCGGTAGTC 
ATACGAACAC 
ATCCCGCTCC 
GTTGCCCAAA 
ACGGCTTGCC 
TAGGCGACGG 
TCGGGCAATG 
CATCATCGGC 
GCATAAGCGA 
TCCACCGAAA 
ACTCAAAGAC 
CCAATGCCGC 
ATCCCCATCA 
CATCACGGCA 
CGAAAGGGAA 
AAATACCCGT 
GCGTTACGGC 
GCAAAAATGT 
TTTGACGGTA 
GCTCGAGCAC 



SYYVRRGNAA 
PETVETAAAD 
ADSTAAHADM 
MRGSTQTVGI 
HDAGDIGYLK 
VPFAATGDLT 
KLSQPLSDKA 
TRLVAGLGAD 
HH* 



CCGACATCGG 
AAAGACAAAG 
CGAGAAACTG 
GTGACAGCCT 
GACTTTATCC 
TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GAGGCACTGG 
CTCGACCGTC 
CAGGGGGGAA 
AAAGCCATCA 
AATATCGGCT 
CCCCTTCGAC 
CCGTTGACGG 
CATCCCGCCG 
CAAAGGCGCG 
ATATCCGCCT 
GACCGTTTCC 
ATTCAAACGC 
CCGCCGAAGC 
GCGGCAGGAG 
AGGCTCAAAC 
ACAAGATGGC 
TATGCCGCAG 
ACAAGGCATA 
AAGGGATTGG 
CATCCTATCA 
ATCCGCCGTC 
CCCCTTACCA 
AAAGAAAACA 
CAAACTGGCA 
AAGGGTTTCC 
CACCACCACC 



RTASAAAHSA 
RTDMPGIRPY 
QGRRLKAVSD 
AAKTGENTTA 
GLFSYGRYKN 
VEGGLRYDLL 
VLFATAGVER 
VEFGNGWNGL 



TGCGGGGCTT 
GTTTGCAGTC 
AAGCTGGCGG 
CAATACGGGC 
GCCAAATCGA 
CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAATCGAACA 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
ATCCTCAGAT 
AGCATTTCGA 
CTTGCCGAGC 
GTTGGGCAAC 
ACATTGTCCG 
AACCATGCCT 
ATTTAGCCTT 
ACGGCTATGA 
AGGGATATAT 
CAACCTGACC 
ACAATGCCGG 
GCCACCCGAT 
CTTCAACGGC 
AAATTGTCGG 
ATTGCTGTCA 
GCGCATCAAC 
CAGCCATCCG 
GAAGCCGTCA 
AGCTGTTCGG 
AGCGGTCGCA 
AGCGACAATT 
TTCCCGAAAT 
TCACCTCCTC 
GACCAACGCC 
GAATTTTGAG 
ACCACTGA 



PAGLKHAVEQ 
GATFRAAAAV 
GLDHNGTGLR 
AATLGMGRST 
SISRSTGADE 
KQDAFAEKGS 
DLNGRDYTVT 
ARYSYAGSKQ 



GCCGATGCAC 
TTTGACGCTG 
CACAAGGTGC 
AAATTGAAGA 
AGTGGACGGG 
AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GCCGAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
TTGGCAAACG 
ACCCGACGGG 
GCAGCGGCCA 
CTGATGATTC 
CTTTTCCGAT 
CACATTCCGA 
TACCGCATCC 
CGGGCCACAG 
ACAGCTACGA 
GACAACCGCA 
TAGTATGCTG 
ACAGCCCCGA 
ACTGCAGATA 
CGCAGGCGAT 
TGCACGGCTT 
GATTTGGCAG 
CGATTGGGCA 
GCAATATCTT 
GGAAAATACG 
GATGGGCGCG 
TTGCCGATGC 
ATCCGTTCAA 
AACCGTGCCG 
ACCCGAAGAC 
AAGCACGTGA 



GGSNLENLMV 
QHANAADGVR 
VIAQTQQDGG 
WSENSANAKT 
HAEGSVNGTL 
ALGWSGNSLT 
GGFTGATAAT 
YGNHSGRVGV 



TAACCGCACC 
GATCAGTCCG 
GGAAAAAACT 
ACGACAAGGT 
CAGCTCATTA 
TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
CAAGCAACTC 
ATTCTTTTAT 
AAATACCACC 
TATCGGATTG 
AACAGGCGGC 
CACGGGCACG 
TTCTGATGAA 
ATTGGGACGG 
GGCGGCGGCT 
CATAAAAGGC 
GCACCGGACA 
ACGCAAGGAG 
GCTGGACAGA 
TCGTTAAAAA 
GCCGTGCAGG 
GGGTCTGCTT 
ATATGGCGCA 
GTCCAAAACC 
TATGGCAGCC 
GCTTGGGCGG 
ATCGCATTGC 
GGCATACGCC 
ACTTGGAGCA 
CCGTCAAACG 
AGGCGTACCG 
AATATGATAC 
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51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
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651 



MVAADIGAGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VISGSVLYNQ 
DGGGGTGSSD 
GKIQSHQLGN 
AGSPVDGFSIi 
VAQNIRLNLT 
SGNAAEAFNG 
STENKMARIN 
IPIKGIGAVR 
KYPSPYHSRN 
FDGKGFPNFE 



ADALTAPLDH 
KLKNDKVSRF 
SEHSGKMVAK 
TYTIDFAAKQ 
AEKGSYSLGI 
LANDSFIRQV 
LMIQQAAIKG 
YRIHWDGYEH 
DNRSTGQRLA 
TADIYKNIIG 
DLADMAQUCD 
GKYGLGGITA 
IRSNLEQRYG 
KHVKYDTLEH 



KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKIEHLKS 
FGGKAQEVAG 
LDRQHFEPDG 
NIGYIVRFSD 
HPADGYDGPQ 
DRFHNAGSML 
AAGEIVGAGD 
YAAAAIRDWA 
HPIKRSQMGA 
KENITSSTVP 
HHHHH* 



DQSVRKNEKL 
QLITLESGEF 
EHTSFDKLPE 
PELNVDLAAA 
SAEVKTVNGI 
KYHLFGSRGE 
HGHEVHSPFD 
GGGYPAPKGA 
TQGVGDGFKR 
AVQGISEGSN 
VQNPNAAQGI 
IALPKGKSAV 
PSNGKNVKLA 



KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQL 
LAERSGHIGL 
NHASHSDSDE 
RDIYSYDIKG 
ATRYS PELDR 
IAVMHGLGLL 
EAVSNIFMAA 
SDNFADAAYA 
DQRHPKTGVP 
Example 16 - C-terminal fusions ('hybrids') with 287/ΔG287

According to the invention, hybrids of two proteins A & B may be either NH2-A-B-COOH
or NH2-B-A-COOH. The effect of this difference was investigated using protein 287 either
C-terminal (in '287-His' form) or N-terminal (in ΔG287 form - sequences shown above) to
919, 953 and ORF46.1. A panel of strains was used, including homologous strain 2996. FCA
was used as adjuvant:





                287 & 919               287 & 953               287 & ORF46.1
Strain          ΔG287-919   919-287     ΔG287-953   953-287     ΔG287-46.1  46.1-287
2996            128000      16000       65536       8192        16384       8192
BZ232           256         128         128         <4          <4          <4
1000            2048        <4          <4          <4          <4          <4
MC58            8192        1024        16384       1024        512         128
NGH38           32000       2048        >2048       4096        16384       4096
394/98          4096        32          256         128         128         16
MenA (F6124)    32000       2048        >2048       32          8192        1024
MenC (BZ133)    64000       >8192       >8192       <16         8192        2048



Better bactericidal titres are generally seen with 287 at the N-terminus (in the ΔG form).
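
The comparison between the two orientations can be tallied strain by strain, taking care that values reported as '<' or '>' are bounds and not exact titres. The following Python sketch is illustrative only; the names `bounds`, `compare` and `tally` are assumptions made here. A pair is counted as 'indeterminate' whenever the reported bounds do not allow the two titres to be ranked.

```python
import math
from collections import Counter

# Columns per strain: (ΔG287-919, 919-287, ΔG287-953, 953-287, ΔG287-46.1, 46.1-287)
TABLE = {
    "2996":         ("128000", "16000", "65536", "8192", "16384", "8192"),
    "BZ232":        ("256", "128", "128", "<4", "<4", "<4"),
    "1000":         ("2048", "<4", "<4", "<4", "<4", "<4"),
    "MC58":         ("8192", "1024", "16384", "1024", "512", "128"),
    "NGH38":        ("32000", "2048", ">2048", "4096", "16384", "4096"),
    "394/98":       ("4096", "32", "256", "128", "128", "16"),
    "MenA (F6124)": ("32000", "2048", ">2048", "32", "8192", "1024"),
    "MenC (BZ133)": ("64000", ">8192", ">8192", "<16", "8192", "2048"),
}
PARTNERS = ("919", "953", "ORF46.1")


def bounds(text):
    """Return (low, high, is_open) for an exact, '<' or '>' titre."""
    text = text.strip()
    if text.startswith("<"):
        return 0.0, float(text[1:]), True
    if text.startswith(">"):
        return float(text[1:]), math.inf, True
    value = float(text)
    return value, value, False


def compare(n_terminal, c_terminal):
    """Rank two titres, returning 'N', 'C', 'equal' or 'indeterminate'."""
    a_lo, a_hi, a_open = bounds(n_terminal)
    b_lo, b_hi, b_open = bounds(c_terminal)
    if a_lo >= b_hi and (a_open or b_open or a_lo > b_hi):
        return "N"
    if b_lo >= a_hi and (a_open or b_open or b_lo > a_hi):
        return "C"
    if not a_open and not b_open and a_lo == b_lo:
        return "equal"
    return "indeterminate"


def tally(table):
    """Count outcomes per fusion partner across all strains."""
    counts = {partner: Counter() for partner in PARTNERS}
    for row in table.values():
        for k, partner in enumerate(PARTNERS):
            counts[partner][compare(row[2 * k], row[2 * k + 1])] += 1
    return counts


for partner, counter in tally(TABLE).items():
    print(partner, dict(counter))
```

On the figures tabulated above, no strain shows a higher titre for the C-terminal 287 orientation; the remaining pairs are either higher for the ΔG287 orientation or cannot be ranked from the reported bounds.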



When fused to protein 961 [NH2-ΔG287-961-COOH - sequence shown above], the resulting
protein is insoluble and must be denatured and renatured for purification. Following
renaturation, around 50% of the protein was found to remain insoluble. The soluble and
insoluble proteins were compared, and much better bactericidal titres were obtained with the
soluble protein (FCA as adjuvant):





             2996     BZ232    MC58     NGH38    F6124    BZ133
Soluble      65536    128      4096     >2048    >2048    4096
Insoluble    8192     <4       <4       16       n.d.     n.d.



Titres with the insoluble form were, however, improved by using alum adjuvant instead: 



             2996     BZ232    MC58     NGH38    F6124    BZ133
Insoluble    32768    128      4096     >2048    >2048    2048



Example 17 - N-terminal fusions ('hybrids') to 287

Expression of protein 287 as full-length with a C-terminal His-tag, or without its leader
peptide but with a C-terminal His-tag, gives fairly low expression levels. Better expression is
achieved using an N-terminal GST-fusion.
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As an alternative to using GST as an N-terminal fusion partner, 287 was placed at the 
C-terminus of protein 919 ('919-287'), of protein 953 ('953-287'), and of protein ORF46.1
('ORF46.1-287'). In each case, the leader peptides were deleted, and the hybrids were direct
in-frame fusions.

To generate the 953-287 hybrid, the leader peptides of the two proteins were omitted by
designing the forward primer downstream from the leader of each sequence; the stop codon
sequence was omitted in the 953 reverse primer but included in the 287 reverse primer. For
the 953 gene, the 5' and the 3' primers used for amplification included an NdeI and a BamHI
restriction site respectively, whereas for the amplification of the 287 gene the 5' and the 3'
primers included a BamHI and an XhoI restriction site respectively. In this way a sequential
directional cloning of the two genes in pET21b+, using NdeI-BamHI (to clone the first gene)
and subsequently BamHI-XhoI (to clone the second gene), could be achieved.
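
The layout of such a sequentially cloned hybrid can be sketched in a few lines of Python. This is an illustration only and not the procedure actually used: the two short open reading frames are hypothetical placeholders (they are not the 953 or 287 sequences), and the helper names `find_sites` and `build_hybrid` are assumptions made here. The sketch checks that each of the three restriction sites occurs exactly once and that the BamHI junction keeps the second gene in frame with the ATG contained in the NdeI site.

```python
# Recognition sequences of the three enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "XhoI": "CTCGAG"}


def find_sites(seq):
    """Return the 1-based start positions of each recognition site in seq."""
    seq = "".join(seq.split()).upper()
    hits = {}
    for name, site in SITES.items():
        hits[name] = [i + 1 for i in range(len(seq) - len(site) + 1)
                      if seq[i:i + len(site)] == site]
    return hits


def build_hybrid(first_orf, second_orf):
    """NdeI - first gene (no stop) - BamHI - second gene (with stop) - XhoI."""
    return SITES["NdeI"] + first_orf + SITES["BamHI"] + second_orf + SITES["XhoI"]


# Hypothetical placeholder ORFs, for illustration only.
first_orf = "GCTGAAAAA"    # Ala-Glu-Lys, stop codon omitted
second_orf = "ACCGTTTAA"   # Thr-Val-stop

construct = build_hybrid(first_orf, second_orf)
sites = find_sites(construct)

# Directional cloning requires each site to be unique in the construct.
unique = all(len(positions) == 1 for positions in sites.values())
# The ATG of the NdeI site starts at position 4; the BamHI site must begin
# a whole number of codons after it for the second gene to stay in frame.
in_frame = (sites["BamHI"][0] - 4) % 3 == 0

print(construct)
print(sites, unique, in_frame)
```

In this arrangement the BamHI site itself contributes two codons (GGA TCC, Gly-Ser) between the two fused sequences.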

The 919-287 hybrid was obtained by cloning the sequence coding for the mature portion of
287 into the XhoI site at the 3'-end of the 919-His clone in pET21b+. The primers used for
amplification of the 287 gene were designed for introducing a SalI restriction site at the 5'-
and an XhoI site at the 3'- of the PCR fragment. Since the cohesive ends produced by the SalI
and XhoI restriction enzymes are compatible, the 287 PCR product digested with SalI-XhoI
could be inserted in the pET21b-919 clone cleaved with XhoI.
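
The compatibility of SalI and XhoI ends can be illustrated with a minimal top-strand model. SalI cuts G^TCGAC and XhoI cuts C^TCGAG, so both leave the same four-base 5' overhang. The following Python sketch is illustrative only; the vector and PCR product strings are hypothetical toy sequences and the helper names are assumptions made here. Only the desired insert orientation is modelled; because both insert ends are compatible with the XhoI-cut vector, orientation would in practice have to be confirmed separately.

```python
# Recognition site and top-strand cut offset for each enzyme.
ENZYMES = {"SalI": ("GTCGAC", 1), "XhoI": ("CTCGAG", 1)}


def overhang(enzyme):
    """Single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def cut_once(seq, enzyme):
    """Cut the top strand at the first occurrence of the enzyme's site."""
    site, cut = ENZYMES[enzyme]
    i = seq.index(site)
    return seq[:i + cut], seq[i + cut:]


# Hypothetical toy sequences, for illustration only.
vector = "AAACTCGAGCACCAC"               # single XhoI site, then His codons
pcr_product = "TTGTCGACGCTGCACTCGAGTT"   # SalI site ... XhoI site

_, rest = cut_once(pcr_product, "SalI")
insert, _ = cut_once(rest, "XhoI")
left_arm, right_arm = cut_once(vector, "XhoI")
recombinant = left_arm + insert + right_arm

print(overhang("SalI"), overhang("XhoI"))
print(recombinant)
print(recombinant.count("CTCGAG"), recombinant.count("GTCGAC"))
```

In this model the 5' junction becomes CTCGAC, which is recognised by neither enzyme, while the 3' junction regenerates a single XhoI site.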

The ORF46.1-287 hybrid was obtained similarly.

The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins
was compared with antibodies raised against simple mixtures of the component antigens:





            Mixture with 287    Hybrid with 287
919         32000               16000
953         8192                8192
ORF46.1     128                 8192



Data for bactericidal activity against heterologous MenB strains and against serotypes A and
C were also obtained for 919-287 and 953-287:

                919                 953                 ORF46.1
Strain          Mixture   Hybrid    Mixture   Hybrid    Mixture   Hybrid
MC58            512       1024      512       1024      -         1024
NGH38           1024      2048      2048      4096      -         4096
BZ232           512       128       1024      16        -         -
MenA (F6124)    512       2048      2048      32        -         1024
MenC (C11)      >2048     n.d.      >2048     n.d.      -         n.d.
MenC (BZ133)    >4096     >8192     >4096     <16       -         2048



Hybrids of ORF46.1 and 919 were also constructed. Best results (four-fold higher titre) were
achieved with 919 at the N-terminus.

Hybrids 919-519His, ORF97-225His and 225-ORF97His were also tested. These gave
moderate ELISA titres and bactericidal antibody responses.

Example 18 - the leader peptide from ORF4

As shown above, the leader peptide of ORF4 can be fused to the mature sequence of other
proteins (e.g. proteins 287 and 919). It is able to direct lipidation in E. coli.

Example 19 - domains in 564 

The protein '564' is very large (2073aa), and it is difficult to clone and express it in complete
form. To facilitate expression, the protein has been divided into four domains, as shown in
figure 8 (according to the MC58 sequence):

Domain         A         B          C           D
Amino Acids    79-360    361-731    732-2044    2045-2073

These domains show the following homologies:

• Domain A shows homology to other bacterial toxins:

gb|AAG03431.1|AE004443_9 probable hemagglutinin [Pseudomonas aeruginosa] (38%)
gb|AAC31981.1| (139897) HecA [Pectobacterium chrysanthemi] (45%)
emb|CAA36409.1| (X52156) filamentous hemagglutinin [Bordetella pertussis] (31%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (26%)
gb|AAA25657.1| (M30186) HpmA precursor [Proteus mirabilis] (29%)

• Domain B shows no homology, and is specific to 564.

• Domain C shows homology to:

gb|AAF84995.1|AE004032 HA-like secreted protein [Xylella fastidiosa] (33%)
gb|AAG05850.1|AE004673 hypothetical protein [Pseudomonas aeruginosa] (27%)
gb|AAF68414.1|AF237928 putative FHA [Pasteurella multocida] (23%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (23%)
pir||S21010 FHA B precursor [Bordetella pertussis] (20%)

• Domain D shows homology to other bacterial toxins:

gb|AAF84995.1|AE004032_14 HA-like secreted protein [Xylella fastidiosa] (29%)

Using the MC58 strain sequence, good intracellular expression of 564ab was obtained in the
form of GST-fusions (no purification) and his-tagged protein; this domain-pair was also
expressed as a lipoprotein, which showed moderate expression in the outer membrane/
supernatant fraction.

The b domain showed moderate intracellular expression when expressed as a his-tagged
product (no purification), and good expression as a GST-fusion.

The c domain showed good intracellular expression as a GST-fusion, but was insoluble. The
d domain showed moderate intracellular expression as a his-tagged product (no purification).
The cd protein domain-pair showed moderate intracellular expression (no purification) as a
GST-fusion.

Good bactericidal assay titres were observed using the c domain and the bc pair.

Example 20 - the 919 leader peptide

The 20mer leader peptide from 919 is discussed in example 1 above: 

MKKYLFRAAL YGIAAAILAA 

As shown in example 1, deletion of this leader improves heterologous expression, as does
substitution with the ORF4 leader peptide. The influence of the 919 leader on expression
was investigated by fusing the coding sequence to the PhoC reporter gene from Morganella
morganii [Thaller et al (1994) Microbiology 140:1341-1350]. The construct was cloned in
the pET21-b plasmid between the NdeI and XhoI sites (Figure 9):

1 MKKYLFRAAL YGIAAAILAA AIPAGNDATT KPDLYYLKNE QAIDSLKLLP
51 PPPEVGSIQF LNDQAMYEKG RMLRNTERGK QAQADADLAA GGVATAFSGA
101 FGYPITEKDS PELYKLLTNM IEDAGDLATR SAKEHYMRIR PFAFYGTETC
151 NTKDQKKLST NGSYPSGHTS IGWATALVLA EVNPANQDAI LERGYQLGQS
201 RVICGYHWQS DVDAARIVGS AAVATLHSDP AFQAQLAKAK QEFAQKSQK*

The level of expression of PhoC from this plasmid is >200-fold lower than that found for the
same construct but containing the native PhoC signal peptide. The same result was obtained
even after substitution of the T7 promoter with the E.coli Plac promoter. This means that the
influence of the 919 leader sequence on expression does not depend on the promoter used.

In order to investigate whether the results observed were due to some peculiarity of the 919 signal
peptide nucleotide sequence (secondary structure formation, sensitivity to RNAases, etc.) or
to protein instability induced by the presence of this signal peptide, a number of mutants
were generated. The approach used was substitution of nucleotides of the 919 signal
peptide sequence by cloning synthetic linkers containing degenerate codons. In this way,
mutants were obtained with nucleotide and/or amino acid substitutions.

Two different linkers were used, designed to produce mutations in two different regions of
the 919 signal peptide sequence: in the first 19 base pairs (L1) and between bases 20-36 (S1).

L1: 5' T ATG AAa/g TAc/t c/tTN TTt/c a/cGC GCC GCC CTG TAC GGC ATC GCC GCC
GCC ATC CTC GCC GCC GCG ATC CC 3'

S1: 5' T ATG AAA AAA TAC CTA TTC CGa/g GCN GCN c/tTa/g TAc/t GGc/g ATC GCC
GCC GCC ATC CTC GCC GCC GCG ATC CC 3'
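
The effect of the degenerate positions can be explored with a short Python sketch. It assumes a reading of the linker notation that is not spelled out in the text (lowercase `x/y` means either base, `N` means any base) and uses the six degenerate codons of the S1 linker exactly as printed above; `expand_codon` and `translate` are hypothetical helper names, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(token: str) -> list[str]:
    """Return every concrete codon encoded by one degenerate token.

    Assumed notation: 'a/g' = A or G at that position, 'N' = any base.
    """
    positions = []
    i = 0
    while i < len(token):
        if i + 2 < len(token) and token[i + 1] == "/":
            positions.append(token[i].upper() + token[i + 2].upper())
            i += 3
        elif token[i].upper() == "N":
            positions.append("ACGT")
            i += 1
        else:
            positions.append(token[i].upper())
            i += 1
    return ["".join(p) for p in product(*positions)]


def translate(codons) -> str:
    """Translate an iterable of codons into a one-letter peptide string."""
    return "".join(CODON_TABLE[c] for c in codons)


# Degenerate codons of the S1 linker as printed (bases 19-36 of the leader).
S1_DEGENERATE = ["CGa/g", "GCN", "GCN", "c/tTa/g", "TAc/t", "GGc/g"]

variants = list(product(*(expand_codon(t) for t in S1_DEGENERATE)))
peptides = {translate(v) for v in variants}

if __name__ == "__main__":
    print(len(variants), "nucleotide variants encoding", sorted(peptides))
```

Read this way, the S1 linker as printed introduces only synonymous changes, so it varies the nucleotide sequence without altering the encoded residues; the same functions can be applied to the L1 codons, where `c/tTN` and `a/cGC` also change the amino acid.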

The alignment of some of the mutants obtained is given below.

L1 mutants:

9L1-a  ATGAAGAAGTACCTTTTCAGCGCCGCC
9L1-e  ATGAAAAAATACTTTTTCCGCGCCGCC
9L1-d  ATGAAAAAATACTTTTTCCGCGCCGCC
9L1-f  ATGAAAAAATATCTCTTTAGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC
919sp  ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9L1a   MKKYLFSAA
9L1e   MKKYFFRAA
9L1d   MKKYFFRAA
9L1f   MKKYLFSAALYGIAAAILAA
919sp  MKKYLFRAALYGIAAAILAA (i.e. native signal peptide)

S1 mutants:

9S1-e  ATGAAAAAATACCTATTC                  ATCGCCGCCGCCATCCTCGCCGCC
9S1-c  ATGAAAAAATACCTATTCCGAGCTGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-b  ATGAAAAAATACCTATTCCGGGCCGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-i  ATGAAAAAATACCTATTCCGGGCGGCTTTGTACGGGATCGCCGCCGCCATCCTCGCCGCC
919sp  ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9S1e   MKKYLF      IAAAILAA
9S1c   MKKYLFRAAQYGIAAAILAA
9S1b   MKKYLFRAAQYGIAAAILAA
9S1i   MKKYLFRAALYGIAAAILAA
919sp  MKKYLFRAALYGIAAAILAA
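
As a check on where the substitutions fall, the following Python sketch compares the first 40 bases of two of the mutants (9L1-f and 9S1-i, transcribed from the alignment above) with the native 919 leader and lists the substituted positions. The function names are hypothetical.

```python
# First 40 bases of each sequence, written codon by codon for readability.
NATIVE = "".join("ATG AAA AAA TAC CTA TTC CGC GCC GCC CTG TAC GGC ATC G".split())
MUTANTS = {
    "9L1-f": "".join("ATG AAA AAA TAT CTC TTT AGC GCC GCC CTG TAC GGC ATC G".split()),
    "9S1-i": "".join("ATG AAA AAA TAC CTA TTC CGG GCG GCT TTG TAC GGG ATC G".split()),
}


def substitutions(native: str, mutant: str) -> list[tuple[int, str, str]]:
    """List (1-based position, native base, mutant base) for each mismatch."""
    if len(native) != len(mutant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(native, mutant), start=1)
        if a != b
    ]


def substituted_positions(name: str) -> list[int]:
    """1-based positions at which the named mutant differs from the native leader."""
    return [pos for pos, _, _ in substitutions(NATIVE, MUTANTS[name])]


if __name__ == "__main__":
    for name in MUTANTS:
        print(name, substituted_positions(name))
```

For these two clones the substitutions stay within the regions targeted by the respective linkers (bases 1-19 for L1, bases 20-36 for S1).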

As shown in the sequence alignments, most of the mutants analysed contain in-frame
deletions which were unexpectedly produced by the host cells.

Selection of the mutants was performed by transforming E. coli BL21(DE3) cells with DNA
prepared from a mixture of L1 and S1 mutated clones. Single transformants were screened
for high PhoC activity by streaking them onto LB plates containing 100 µg/ml ampicillin,
50 µg/ml methyl green and 1 mg/ml PDP (phenolphthalein diphosphate). On this medium PhoC-
producing cells become green (Figure 10).

A quantitative analysis of PhoC produced by these mutants was carried out in liquid medium
using pNPP as a substrate for PhoC activity. The specific activities measured in cell extracts
and supernatants of mutants grown in liquid medium for 0, 30, 90 and 180 min were:

CELL EXTRACTS

            0         30        90        180
control     0.00      0.00      0.00      0.00
9phoC       1.11      1.11      3.33      4.44
9S1e        102.12    111.00    149.85    172.05
9L1a        206.46    111.00    94.35     83.25
9L1d        5.11      4.77      4.00      3.11
9L1f        27.75     94.35     82.14     36.63
9S1b        156.51    111.00    72.15     28.86
9S1c        72.15     33.30     21.09     14.43
9S1i        156.51    83.25     55.50     26.64
phoCwt      194.25    180.93    149.85    142.08

SUPERNATANTS

            0         30        90        180
control     0.00      0.00      0.00      0.00
9phoC       0.33      0.00      0.00      0.00
9S1e        0.11      0.22      0.44      0.89
9L1a        4.88      5.99      5.99      7.22
9L1d        0.11      0.11      0.11      0.11
9L1f        0.11      0.22      0.11      0.11
9S1b        1.44      1.44      1.44      1.67
9S1c        0.44      0.78      0.56      0.67
9S1i        0.22      0.44      0.22      0.78
phoCwt      34.41     43.29     87.69     177.60

Some of the mutants produce high amounts of PhoC and, in particular, mutant 9L1a can
secrete PhoC into the culture medium. This is noteworthy since the signal peptide sequence of
this mutant is only 9 amino acids long. This is the shortest signal peptide described to date.
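
The comparison can be made concrete with a small Python sketch using the specific activities tabulated above (entered with decimal points; the units are not stated in the text). The choice of the 0 min cell-extract value of 9phoC as reference, and the helper name `fold_over`, are assumptions made for illustration only.

```python
TIMES = (0, 30, 90, 180)  # minutes of growth in liquid medium

# Specific activities transcribed from the tables above.
CELL_EXTRACTS = {
    "9phoC": (1.11, 1.11, 3.33, 4.44),
    "9S1e": (102.12, 111.00, 149.85, 172.05),
    "9L1a": (206.46, 111.00, 94.35, 83.25),
    "9L1d": (5.11, 4.77, 4.00, 3.11),
    "9L1f": (27.75, 94.35, 82.14, 36.63),
    "9S1b": (156.51, 111.00, 72.15, 28.86),
    "9S1c": (72.15, 33.30, 21.09, 14.43),
    "9S1i": (156.51, 83.25, 55.50, 26.64),
    "phoCwt": (194.25, 180.93, 149.85, 142.08),
}
SUPERNATANTS = {
    "9phoC": (0.33, 0.00, 0.00, 0.00),
    "9S1e": (0.11, 0.22, 0.44, 0.89),
    "9L1a": (4.88, 5.99, 5.99, 7.22),
    "9L1d": (0.11, 0.11, 0.11, 0.11),
    "9L1f": (0.11, 0.22, 0.11, 0.11),
    "9S1b": (1.44, 1.44, 1.44, 1.67),
    "9S1c": (0.44, 0.78, 0.56, 0.67),
    "9S1i": (0.22, 0.44, 0.22, 0.78),
    "phoCwt": (34.41, 43.29, 87.69, 177.60),
}


def fold_over(reference: str, table: dict, minute: int) -> dict:
    """Activity of every other clone relative to `reference` at one time point."""
    idx = TIMES.index(minute)
    ref = table[reference][idx]
    return {name: vals[idx] / ref for name, vals in table.items() if name != reference}


# Cell-extract activity relative to the unmodified 919 leader fusion at 0 min.
fold_at_0 = fold_over("9phoC", CELL_EXTRACTS, 0)

# Leader mutant with the highest supernatant activity at 180 min.
mutants = [n for n in SUPERNATANTS if n not in ("9phoC", "phoCwt")]
best_secretor = max(mutants, key=lambda n: SUPERNATANTS[n][TIMES.index(180)])

if __name__ == "__main__":
    for name, fold in sorted(fold_at_0.items(), key=lambda kv: -kv[1]):
        print(f"{name:7s} {fold:6.1f}-fold")
    print("highest supernatant activity among mutants:", best_secretor)
```

On these figures, several mutants approach or match the cell-extract activity of the construct carrying the native PhoC signal peptide at 0 min, and 9L1a has the highest supernatant activity among the leader mutants.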

Example 21 - C-terminal deletions of Maf-related proteins 
MafB-related proteins include 730, ORF46 and ORF29. 

The 730 protein from MC58 has the following sequence:

1 VKPLRRLTNL LAACAVAAAA LIQPALAADL AQDPFITDNA QRQHYEPGGK
51 YHLFGDPRGS VSDRTGKINV IQDYTHQMGN LLIQQANING TIGYHTRFSG
101 HGHEEHAPFD NHAADSASEE KGNVDEGFTV YRLNWEGHEH HPADAYDGPK
151 GGNYPKPTGA RDEYTYHVNG TARSIKLNPT DTRSIRQRIS DNYSNLGSNF
201 SDRADEANRK MFEHNAKLDR WGNSMEFING VAAGALNPFI SAGEALGIGD
251 ILYGTRYAID KAAMRNIAPL PAEGKFAVIG GLGSVAGFEK NTREAVDRWI
301 QENPNAAETV EAVFNVAAAA KVAKLAKAAK PGKAAVSGDF ADSYKKKLAL
351 SDSARQLYQN AKYREALDIH YEDLIRRKTD GSSKFINGRE IDAVTNDALI
401 QAKRTISAID KPKNFLNQKN RKQIKATIEA ANQQGKRAEF WFKYGVHSQV
451 KSYIESKGGI VKTGLGD*

The leader peptide is underlined.

730 shows similar features to ORF46 (see example 8 above):

- as for ORF46, the conservation of the 730 sequence among MenB, MenA and gonococcus
is high (>80%) only for the N-terminal portion. The C-terminus, from ~340, is highly
divergent.

- its predicted secondary structure contains a hydrophobic segment spanning the central
region of the molecule (aa. 227-247).

- expression of the full-length gene in E. coli gives very low yields of protein. Expression
from tagged or untagged constructs where the signal peptide sequence has been omitted
has a toxic effect on the host cells. In other words, the presence of the full-length mature
protein in the cytoplasm is highly toxic for the host cell, while its translocation to the
periplasm (mediated by the signal peptide) has no detectable effect on cell viability. This
"intracellular toxicity" of 730 is particularly high, since clones for expression of the
leaderless 730 can only be obtained at very low frequency using a recA genetic
background (E. coli strains: HB101 for cloning; HMS174(DE3) for expression).

To overcome this toxicity, a similar approach was used for 730 as described in example 8 for
ORF46. Four C-terminal truncated forms were obtained, each of which is well expressed. All
were obtained from intracellular expression of His-tagged leaderless 730.

Form A consists of the N-terminal hydrophilic region of the mature protein (aa. 28-226).
This was purified as a soluble His-tagged product, having a higher-than-expected MW.

Form B extends to the end of the region conserved between serogroups (aa. 28-340). This
was purified as an insoluble His-tagged product.

The C-terminal truncated forms named C1 and C2 were obtained after screening for clones
expressing high levels of 730-His in strain HMS174(DE3). Briefly, the pET21b
plasmid containing the His-tagged sequence coding for the full-length mature 730 protein
was used to transform the recA strain HMS174(DE3). Transformants were obtained at low
frequency and showed two phenotypes: large colonies and very small colonies. Several
large and small colonies were analysed for expression of the 730-His clone. Only cells from
large colonies over-expressed a protein recognised by anti-730A antibodies. However, the
protein over-expressed in different clones showed differences in molecular mass.
Sequencing of two of the clones revealed that in both cases integration of an E. coli IS
sequence had occurred within the sequence coding for the C-terminal region of 730. The two
integration events have produced in-frame fusions with 1 additional codon in the case of C1,
and 12 additional codons in the case of C2 (Figure 11). The resulting "mutant" forms of 730
have the following sequences:

730-C1 (due to an IS1 insertion - Figure 11A)

1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH
51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE
101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK
151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME
201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF
251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVEAVFNV AAAAKVAKLA
301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LDIHYEDLIR
351 RKTDGSSKFI NGREIDAVTN DALIQAR*

The additional amino acid produced by the insertion is underlined.

730-C2 (due to an IS5 insertion - Figure 11B)

1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH
51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE
101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK
151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME
201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF
251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVEAVFNV AAAAKVAKLA
301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LGKVRISGEI
351 LLG*

The additional amino acids produced by the insertion are underlined.

In conclusion, intracellular expression of the 730-C1 form gives a very high level of protein
and has no toxic effect on the host cells, whereas the presence of the native C-terminus is
toxic. These data suggest that the "intracellular toxicity" of 730 is associated with the
C-terminal 65 amino acids of the protein.
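
The figure of 65 residues can be reproduced from the sequence listings with a few lines of Python. The sketch assumes that the leading Met of the C1 and C2 forms is supplied by the expression construct, that the native portion starts at residue 28 (as for Forms A and B), and that the insertion-derived residues number 1 and 12 respectively, as stated above; the function name is hypothetical.

```python
# Lengths read from the listings: the last numbered row plus its residues.
FULL_LENGTH = 450 + len("KSYIESKGGIVKTGLGD")           # native 730 (MC58)
C1_LENGTH = 350 + len("RKTDGSSKFINGREIDAVTNDALIQAR")    # 730-C1
C2_LENGTH = 350 + len("LLG")                             # 730-C2

MATURE_START = 28  # first native residue present in the truncated forms


def last_native_residue(total_length: int, extra_residues: int) -> int:
    """Native coordinate of the last 730 residue kept in a truncated form.

    One residue is subtracted for the added N-terminal Met (an assumption)
    and `extra_residues` for the residues contributed by the IS insertion.
    """
    native_count = total_length - 1 - extra_residues
    return MATURE_START + native_count - 1


c1_end = last_native_residue(C1_LENGTH, extra_residues=1)
c2_end = last_native_residue(C2_LENGTH, extra_residues=12)
removed_in_c1 = FULL_LENGTH - c1_end

if __name__ == "__main__":
    print("730-C1 keeps native residues", MATURE_START, "to", c1_end)
    print("730-C2 keeps native residues", MATURE_START, "to", c2_end)
    print("native residues missing from 730-C1:", removed_in_c1)
```

Under these assumptions 730-C1 ends at native residue 402 of 467, which leaves the 65 C-terminal residues referred to in the text.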

Equivalent truncation of ORF29 to the first 231 or 368 amino acids has been performed,
using expression with or without the leader peptide (amino acids 1-26; deletion gives
cytoplasmic expression) and with or without a His-tag.



Example 22 - domains in 961 

As described in example 9 above, the GST-fusion of 961 was the best-expressed in E.coli.
To improve expression, the protein was divided into domains (figure 12).

The domains of 961 were designed on the basis of YadA (an adhesin produced by Yersinia
which has been demonstrated to be localized on the bacterial surface, where it forms
oligomers that generate surface projections [Hoiczyk et al (2000) EMBO J 19:5989-99]) and
are: leader peptide, head domain, coiled-coil region (stalk), and membrane anchor domain.

These domains were expressed with or without the leader peptide, and optionally fused
either to a C-terminal His-tag or to N-terminal GST. E.coli clones expressing different
domains of 961 were analyzed by SDS-PAGE and western blot for the production and
localization of the expressed protein, from over-night (o/n) culture or after 3 hours induction
with IPTG. The results were:





                  Total lysate      Periplasm         Supernatant       OMV
                  (Western Blot)    (Western Blot)    (Western Blot)    SDS-PAGE
961 (o/n)
961 (IPTG)        +/-
961-L (o/n)       +                                                     +
961-L (IPTG)      +                                                     +
961c-L (o/n)
961c-L (IPTG)     +                 +                 +
961Δ1-L (o/n)
961Δ1-L (IPTG)    +                                                     +



The results show that in E.coli:

■ 961-L is highly expressed and localized on the outer membrane. By western blot analysis
two specific bands have been detected: one at ~45kDa (the predicted molecular weight) and
one at ~180kDa, indicating that 961-L can form oligomers. Additionally, these aggregates
are more highly expressed in the over-night culture (without IPTG induction). OMV preparations of
this clone were used to immunize mice and serum was obtained. Using the over-night culture
(predominantly oligomeric form) the serum was bactericidal; using the IPTG-induced culture
(predominantly monomeric) it was not bactericidal.

■ 961Δ1-L (with a partial deletion in the anchor region) is highly expressed and localized
on the outer membrane, but does not form oligomers;

■ 961c-L (without the anchor region) is produced in soluble form and exported into the
supernatant.

Titres in ELISA and in the serum bactericidal assay using His-fusions were as follows:

                     ELISA     Bactericidal
961a (aa 24-268)     24397     4096
961b (aa 269-405)    7763      64
961c-L               29770     8192
961c (2996)          30774     >65536
961c (MC58)          33437     16384
961d                 26069     >65536


E.coli clones expressing different forms of 961 (961, 961-L, 961Δ1-L and 961c-L) were used
to investigate whether 961 is an adhesin (c.f. YadA). An adhesion assay was performed using
(a) human epithelial cells and (b) E.coli clones after either over-night culture or three
hours IPTG induction. 961-L grown over-night (961Δ1-L) and IPTG-induced 961c-L (the
clones expressing protein on surface) adhere to human epithelial cells.

961c was also used in hybrid proteins (see above). As 961 and its domain variants direct 
efficient expression, they are ideally suited as the N-terminal portion of a hybrid protein. 



Example 23 - further hybrids

Further hybrid proteins of the invention are shown below (see also Figure 14). These are 
advantageous when compared to the individual proteins: 

ORF46.1-741

1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA 

51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC 

101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG 

151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA 

201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA 

251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA 

301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA 

351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA 

401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC 

451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA 

501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG 

551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC 

601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA 

651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA 

701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG 

751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC 

801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG 

851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA 

901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA 

951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA 

1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT 

1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT 

1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG 

1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG 

1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GGGGTGGTGT 

1251 CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA TGCACTAACC GCACCGCTCG 

1301 ACCATAAAGA CAAAGGTTTG CAGTCTTTGA CGCTGGATCA GTCCGTCAGG 

1351 AAAAACGAGA AACTGAAGCT GGCGGCACAA GGTGCGGAAA AAACTTATGG 

1401 AAACGGTGAC AGCCTCAATA CGGGCAAATT GAAGAACGAC AAGGTCAGCC 

1451 GTTTCGACTT TATCCGCCAA ATCGAAGTGG ACGGGCAGCT CATTACCTTG 

1501 GAGAGTGGAG AGTTCCAAGT ATACAAACAA AGCCATTCCG CCTTAACCGC

1551 CTTTCAGACC GAGCAAATAC AAGATTCGGA GCATTCCGGG AAGATGGTTG 

1601 CGAAACGCCA GTTCAGAATC GGCGACATAG CGGGCGAACA TACATCTTTT
1651 GACAAGCTTC CCGAAGGCGG CAGGGCGACA TATCGCGGGA CGGCGTTCGG
1701 TTCAGACGAT GCCGGCGGAA AACTGACCTA CACCATAGAT TTCGCCGCCA
1751 AGCAGGGAAA CGGCAAAATC GAACATTTGA AATCGCCAGA ACTCAATGTC
1801 GACCTGGCCG CCGCCGATAT CAAGCCGGAT GGAAAACGCC ATGCCGTCAT
1851 CAGCGGTTCC GTCCTTTACA ACCAAGCCGA GAAAGGCAGT TACTCCCTCG
1901 GTATCTTTGG CGGAAAAGCC CAGGAAGTTG CCGGCAGCGC GGAAGTGAAA
1951 ACCGTAAACG GCATACGCCA TATCGGCCTT GCCGCCAAGC AACTCGAGCA
2001 CCACCACCAC CACCACTGA

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
401 NFEKHVKYDT GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR
451 KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL
501 ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF
551 DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV
601 DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK
651 TVNGIRHIGL AAKQLEHHHH HH*

ORF46.1-961

1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT
1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA
2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC
2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA
2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC
2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT
2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC GGTTCAATGT AACGGCTGCA
2251 GTCGGCGGCT ACAAATCCGA ATCGGCAGTC GCCATCGGTA CCGGCTTCCG
2301 CTTTACCGAA AACTTTGCCG CCAAAGCAGG CGTGGCAGTC GGCACTTCGT
2351 CCGGTTCTTC CGCAGCCTAC CATGTCGGCG TCAATTACGA GTGGCTCGAG
2401 CACCACCACC ACCACCACTG A

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI
451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK
501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET
551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK
601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA
651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK
701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGRFNVTAA
751 VGGYKSESAV AIGTGFRFTE NFAAKAGVAV GTSSGSSAAY HVGVNYEWLE
801 HHHHHH*

ORF46.1-961c

1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT

1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA 

2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC 

2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA 

2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC 

2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT 

2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC TCGAGCACCA CCACCACCAC 

2251 CACTGA 

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ 

51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG 

101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL 

151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA 

201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA 

251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG 

301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH 

351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP 

401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI 

451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK

501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET 

551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK 

601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA 

651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK 

701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGLEHHHHH 

751 H* 
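
In the hybrid listings above, the two partners are separated by a GSGGGG stretch (its first two codons, GGATCC, form a BamHI recognition site). A minimal Python sketch, using row 401 of the ORF46.1-961c protein as listed and a hypothetical helper `split_at_linker`, locates that stretch and the flanking residues.

```python
LINKER = "GSGGGG"

# Residues 401-450 of the ORF46.1-961c hybrid, as listed above.
ROW_START = 401
ROW = "NFEKHVKYDTGSGGGGATNDDDVKKAATVAIAAAYNNGQEINGFKAGETI"


def split_at_linker(row: str, row_start: int, linker: str = LINKER):
    """Return (upstream residues, (first, last) linker coordinates, downstream residues)."""
    idx = row.find(linker)
    if idx < 0:
        raise ValueError("linker not found in this row")
    first = row_start + idx
    last = first + len(linker) - 1
    return row[:idx], (first, last), row[idx + len(linker):]


upstream, linker_span, downstream = split_at_linker(ROW, ROW_START)

if __name__ == "__main__":
    print("end of first partner  :", upstream)
    print("linker residues       :", linker_span)
    print("start of second partner:", downstream[:10])
```

For this row the stretch occupies residues 411-416, with the ORF46.1 portion ending at residue 410 and the 961c portion starting at residue 417.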



961-ORF46.1 

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT 

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG 

1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC 

1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC 

1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG 

1151 GATCCGGAGG AGGAGGATCA GATTTGGCAA ACGATTCTTT TATCCGGCAG 

1201 GTTCTCGACC GTCAGCATTT CGAACCCGAC GGGAAATACC ACCTATTCGG 

1251 CAGCAGGGGG GAACTTGCCG AGCGCAGCGG CCATATCGGA TTGGGAAAAA 

1301 TACAAAGCCA TCAGTTGGGC AACCTGATGA TTCAACAGGC GGCCATTAAA 

1351 GGAAATATCG GCTACATTGT CCGCTTTTCC GATCACGGGC ACGAAGTCCA 

1401 TTCCCCCTTC GACAACCATG CCTCACATTC CGATTCTGAT GAAGCCGGTA 

1451 GTCCCGTTGA CGGATTTAGC CTTTACCGCA TCCATTGGGA CGGATACGAA 

1501 CACCATCCCG CCGACGGCTA TGACGGGCCA CAGGGCGGCG GCTATCCCGC 

1551 TCCCAAAGGC GCGAGGGATA TATACAGCTA CGACATAAAA GGCGTTGCCC 

1601 AAAATATCCG CCTCAACCTG ACCGACAACC GCAGCACCGG ACAACGGCTT 

1651 GCCGACCGTT TCCACAATGC CGGTAGTATG CTGACGCAAG GAGTAGGCGA 

1701 CGGATTCAAA CGCGCCACCC GATACAGCCC CGAGCTGGAC AGATCGGGCA 

1751 ATGCCGCCGA AGCCTTCAAC GGCACTGCAG ATATCGTTAA AAACATCATC 

1801 GGCGCGGCAG GAGAAATTGT CGGCGCAGGC GATGCCGTGC AGGGCATAAG 

1851 CGAAGGCTCA AACATTGCTG TCATGCACGG CTTGGGTCTG CTTTCCACCG 

1901 AAAACAAGAT GGCGCGCATC AACGATTTGG CAGATATGGC GCAACTCAAA 
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10 



1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 



GACTATGCCG 
CGCACAAGGC 
TCAAAGGGAT 
GCACATCCTA 
GAAATCCGCC 
CGTCCCCTTA 
GGCAAAGAAA 
TGTCAAACTG 
GTAAAGGGTT 
CACCACCACC 



CAGCAGCCAT 
ATAGAAGCCG 
TGGAGCTGTT 
TCAAGCGGTC 
GTCAGCGACA 
CCATTCCCGA 
ACATCACCTC 
GCAGACCAAC 
TCCGAATTTT 
ACCACCACTG 



CCGCGATTGG 
TCAGCAATAT 
CGGGGAAAAT 
GCAGATGGGC 
ATTTTGCCGA 
AATATCCGTT 
CTCAACCGTG 
GCCACCCGAA 
GAGAAGCACG 
A 



GCAGTCCAAA 
CTTTATGGCA 
ACGGCTTGGG 
GCGATCGCAT 
TGCGGCATAC 
CAAACTTGGA 
CCGCCGTCAA 
GACAGGCGTA 
TGAAATATGA 



ACCCCAATGC 
GCCATCCCCA 
CGGCATCACG 
TGCCGAAAGG 
GCCAAATACC 
GCAGCGTTAC 
ACGGCAAAAA 
CCGTTTGACG 
TACGCTCGAG 
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l 

MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGS DLANDSFIRQ
401 VLDRQHFEPD GKYHLFGSRG ELAERSGHIG LGKIQSHQLG NLMIQQAAIK
451 GNIGYIVRFS DHGHEVHSPF DNHASHSDSD EAGSPVDGFS LYRIHWDGYE
501 HHPADGYDGP QGGGYPAPKG ARDIYSYDIK GVAQNIRLNL TDNRSTGQRL
551 ADRFHNAGSM LTQGVGDGFK RATRYSPELD RSGNAAEAFN GTADIVKNII
601 GAAGEIVGAG DAVQGISEGS NIAVMHGLGL LSTENKMARI NDLADMAQLK
651 DYAAAAIRDW AVQNPNAAQG IEAVSNIFMA AIPIKGIGAV RGKYGLGGIT
701 AHPIKRSQMG AIALPKGKSA VSDNFADAAY AKYPSPYHSR NIRSNLEQRY
751 GKENITSSTV PPSNGKNVKL ADQRHPKTGV PFDGKGFPNF EKHVKYDTLE
801 HHHHHH*

961-741

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG
1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC
1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC
1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG
1151 GATCCGGAGG GGGTGGTGTC GCCGCCGACA TCGGTGCGGG GCTTGCCGAT
1201 GCACTAACCG CACCGCTCGA CCATAAAGAC AAAGGTTTGC AGTCTTTGAC
1251 GCTGGATCAG TCCGTCAGGA AAAACGAGAA ACTGAAGCTG GCGGCACAAG
1301 GTGCGGAAAA AACTTATGGA AACGGTGACA GCCTCAATAC GGGCAAATTG
1351 AAGAACGACA AGGTCAGCCG TTTCGACTTT ATCCGCCAAA TCGAAGTGGA
1401 CGGGCAGCTC ATTACCTTGG AGAGTGGAGA GTTCCAAGTA TACAAACAAA
1451 GCCATTCCGC CTTAACCGCC TTTCAGACCG AGCAAATACA AGATTCGGAG
1501 CATTCCGGGA AGATGGTTGC GAAACGCCAG TTCAGAATCG GCGACATAGC
1551 GGGCGAACAT ACATCTTTTG ACAAGCTTCC CGAAGGCGGC AGGGCGACAT
1601 ATCGCGGGAC GGCGTTCGGT TCAGACGATG CCGGCGGAAA ACTGACCTAC
1651 ACCATAGATT TCGCCGCCAA GCAGGGAAAC GGCAAAATCG AACATTTGAA
1701 ATCGCCAGAA CTCAATGTCG ACCTGGCCGC CGCCGATATC AAGCCGGATG
1751 GAAAACGCCA TGCCGTCATC AGCGGTTCCG TCCTTTACAA CCAAGCCGAG
1801 AAAGGCAGTT ACTCCCTCGG TATCTTTGGC GGAAAAGCCC AGGAAGTTGC
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1161 
GATCCGGCGG
1201 GGTATCGGCA
1251 TTACGCCGGT
1301 CCGGTCGGGA
1351 CCCCCCCCGA
1401 CAAGAATTTG
1451 GCGGGGTAGA
1501 TCCTTTCCCG
1551 CAAAAACTAT
1601 GTAAAGACAT
1651 GCAAAGCCGA
1701 GGTCTCCCAT
1751 GTATTGCGCC
1801 AAGAACGAAA
1851 CGAACGTGGC
1901 CAGGCACTGC
1951 CAAGCGTTGC
2001 CCTGATGCAA
2051 AAAACATGCT
2101 AACACATATG
2151 TATCACAGTC
2201 TGTATGGAGA
2251 GGAATTACTG
2301 TTTCACCCGT
2351 CCATCGTAAC

AATVAIAAAY 
GLGLKKVVTN 
DAALDATTNA 
NDIADSLDET 
AAAGTANTAA 
SKFVRIDGLN 
GLAEQAALSG 
AGVAVGTSSG 
KGLQSLTLDQ 
IRQIEVDGQL 
FRIGDIAGEH 
GKIEHLKSPE 
GKAQEVAGSA 



ACGACGACGA 
AACAATGGCC 
CATTGATGAA 
TTGAAGCCGA 
CTGACCAAAA 
AGCTGCAGAA 
ATGCCGCTTT 
TTGAATAAAT 
AAATATCGTA 
ACAAGCATGC 
AACACTAAGG 
GGCCGAAGAA 
CTGCAGCAGG 
GACAAGGCCG 
CGCTACGAAC 
ACACCAGAGA 
GCTACTACCG 
TGCCGATCAC 
TGCGCAAAGA 
CTGTTCCAAC 
CGGCTACAAA 
CCGAAAACTT 
TCTTCCGCAG 
AGGCGGCACT 
GCAACAGCAG 
ATCAAGAACG 
TGACGTTGCG 
ATCTGCATAC 
ATCAACCTCA 
GGTAGGTATC 
AACTGTATGG 
ACGGCGTATA 
TGAAGCTTCT 
CGGATATCCG 
ATTATTGGCG 
CGATGCGACG 
TGATGGTTGC 
GTGCGCATCG 
CGACCTTTTC 
TCGACTATTC 
CAGAGCGATT 
TTTCATCTTT 
CCCTATTGCC 
GCAGGCGTAG 
ACCGGGTACA 
CCATGTGGTG 
ACAAACCCGA 
CGGCACGGCG 



NNGQEINGFK 
LTKTVNENKQ 
LNKLGENITT 
NTKADEAVKT 
DKAEAVAAKV 
ATTEKLDTRL
LFQPYNVGRF 
SSAAYHVGVN 
SVRKNEKLKL 
ITLESGEFQV 
TSFDKLPEGG 
LNVDLAAADI 
EVKTVNGIRH 



TGTTAAAAAA
AAGAAATCAA 
GACGGCACAA 
CGACTTTAAA 
CCGTCAATGA 
TCTGAAATAG 
AGCAGATACT 
TGGGAGAAAA 
AAAATTGATG 
CGAAGCATTC 
CAGACGAAGC 
ACCAAACAAA 
CAAAGCCGAA 
AAGCTGTCGC 
AAAGATAATA 
AGAGTCTGAC 
AAAAATTGGA
GATACTCGCC 
AACCCGCCAA 
CTTACAACGT 
TCCGAATCGG 
TGCCGCCAAA 
CCTACCATGT 
TCTGCGCCCG 
AGCAACAACA 
AAATGTGCAA 
GTTACAGACA 
CGGAGACTTT 
AACCTGCAAT 
GTCGACACAG 
CAGAAAAGAA 
TGCGGAAGGA 
TTCGACGATG 
CCACGTAAAA 
GGCGTTCCGT 
CTACACATAA 
AGCCATCCGC 
TCAATAACAG 
CAAATAGCCA 
CGGCGGTGAT 
ACGGCAACCT
TCGACAGGCA 
ATTTTATGAA 
ACCGCAGTGG 
GAACCGCTTG 
CCTCTCGGCA 
TTCAAATTGC 
GCTCTGCTGC 



AGETIYDIDE 
NVDAKVKAAE 
FAEETKTNIV 
ANEAKQTAEE 
TDIKADIATN 
ASAEKSIADH 
NVTAAVGGYK 
YEWGSGGGGV 
AAQGAEKTYG 
YKQSHSALTA 
RATYRGTAFG 
KPDGKRHAVI 
IGLAAKQLEH 



DGTITKKDAT 
SEIEKLTTKL 
KIDEKLEAVA 
TKQNVDAKVK 
KDNIAKKANS 
DTRLNGLDKT 
SESAVAIGTG 
AADIGAGLAD 
NGDSLNTGKL 
FQTEQIQDSE 
SDDAGGKLTY 
SGSVLYNQAE 
HHHHH* 



GCTGCCACTG 
CGGTTTCAAA 
TTACCAAAAA 
GGTCTGGGTC 
AAACAAACAA 
AAAAGTTAAC 
GATGCCGCTC 
TATAACGACA 
AAAAATTAGA 
AACGATATCG 
CGTCAAAACC 
ACGTCGATGC 
GCTGCCGCTG 
TGCAAAAGTT 
TTGCTAAAAA 
AGCAAATTTG 
CACACGCTTG 
TGAACGGTTT 
GGCCTTGCAG 
GGGTCGGTTC 
CAGTCGCCAT 
GCAGGCGTGG 
CGGCGTCAAT 
ACTTCAATGC 
GCGAAATCAG 
AGACAGAAGC 
GGGATGCCAA 
CCAAACCCAA 
TGAAGCAGGC 
GCGAATCCGT 
CACGGCTATA 
AGCGCCTGAA 
AGGCCGTTAT 
GAAATCGGAC 
GGACGGCAGA 
TGAATACGAA 
AATGCATGGG 
TTTTGGAACA 
ATTCGGAGGA 
AAAACAGACG 
GTCCTACCAC 
ATGACGCACA 
AAAGACGCTC 
AGAAAAGTTC 
AGTATGGCTC
CCCTATGAAG 
CGGAACATCC 
TGCAGAAATA 



TGGCCATTGC 
GCTGGAGAGA 
AGACGCAACT 
TGAAAAAAGT 
AACGTCGATG 
AACCAAGTTA 
TGGATGCAAC 
TTTGCTGAAG 
AGCCGTGGCT 
CCGATTCATT 
GCCAATGAAG 
CAAAGTAAAA 
GCACAGCTAA 
ACCGACATCA 
AGCAAACAGT 
TCAGAATTGA 
GCTTCTGCTG 
GGATAAAACA 
AACAAGCCGC 
AATGTAACGG 
CGGTACCGGC 
CAGTCGGCAC 
TACGAGTGGG 
AGGCGGTACC 
CAGCAGTATC 
ATGCTCTGTG 
AATCAATGCC 
ATGACGCATA
TATACAGGAC 
CGGCAGCATA 
ACGAAAATTA 
GACGGAGGCG 
AGAGACTGAA 
ACATCGATTT 
CCTGCAGGCG 
TGATGAAACC 
TCAAGCTGGG 
ACATCGAGGG 
GCAGTACCGC 
AGGGTATCCG 
ATCCGTAATA 
AGCTCAGCCC 
AAAAAGGCAT 
AAACGGGAAA 
CAACCATTGC 
CAAGCGTCCG 
TTTTCCGCAC 
CCCGTGGATG 
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10 



2401 AGCAACGACA ACCTGCGTAC CACGTTGCTG ACGACGGCTC AGGACATCGG
2451 TGCAGTCGGC GTGGACAGCA AGTTCGGCTG GGGACTGCTG GATGCGGGTA
2501 AGGCCATGAA CGGACCCGCG TCCTTTCCGT TCGGCGACTT TACCGCCGAT
2551 ACGAAAGGTA CATCCGATAT TGCCTACTCC TTCCGTAACG ACATTTCAGG
2601 CACGGGCGGC CTGATCAAAA AAGGCGGCAG CCAACTGCAA CTGCACGGCA
2651 ACAACACCTA TACGGGCAAA ACCATTATCG AAGGCGGTTC GCTGGTGTTG
2701 TACGGCAACA ACAAATCGGA TATGCGCGTC GAAACCAAAG GTGCGCTGAT
2751 TTATAACGGG GCGGCATCCG GCGGCAGCCT GAACAGCGAC GGCATTGTCT
2801 ATCTGGCAGA TACCGACCAA TCCGGCGCAA ACGAAACCGT ACACATCAAA
2851 GGCAGTCTGC AGCTGGACGG CAAAGGTACG CTGTACACAC GTTTGGGCAA
2901 ACTGCTGAAA GTGGACGGTA CGGCGATTAT CGGCGGCAAG CTGTACATGT
2951 CGGCACGCGG CAAGGGGGCA GGCTATCTCA ACAGTACCGG ACGACGTGTT
3001 CCCTTCCTGA GTGCCGCCAA AATCGGGCAG GATTATTCTT TCTTCACAAA
3051 CATCGAAACC GACGGCGGCC TGCTGGCTTC CCTCGACAGC GTCGAAAAAA
3101 CAGCGGGCAG TGAAGGCGAC ACGCTGTCCT ATTATGTCCG TCGCGGCAAT
3151 GCGGCACGGA CTGCTTCGGC AGCGGCACAT TCCGCGCCCG CCGGTCTGAA
3201 ACACGCCGTA GAACAGGGCG GCAGCAATCT GGAAAACCTG ATGGTCGAAC
3251 TGGATGCCTC CGAATCATCC GCAACACCCG AGACGGTTGA AACTGCGGCA
3301 GCCGACCGCA CAGATATGCC GGGCATCCGC CCCTACGGCG CAACTTTCCG
3351 CGCAGCGGCA GCCGTACAGC ATGCGAATGC CGCCGACGGT GTACGCATCT
3401 TCAACAGTCT CGCCGCTACC GTCTATGCCG ACAGTACCGC CGCCCATGCC
3451 GATATGCAGG GACGCCGCCT GAAAGCCGTA TCGGACGGGT TGGACCACAA
3501 CGGCACGGGT CTGCGCGTCA TCGCGCAAAC CCAACAGGAC GGTGGAACGT
3551 GGGAACAGGG CGGTGTTGAA GGCAAAATGC GCGGCAGTAC CCAAACCGTC
3601 GGCATTGCCG CGAAAACCGG CGAAAATACG ACAGCAGCCG CCACACTGGG
3651 CATGGGACGC AGCACATGGA GCGAAAACAG TGCAAATGCA AAAACCGACA
3701 GCATTAGTCT GTTTGCAGGC ATACGGCACG ATGCGGGCGA TATCGGCTAT
3751 CTCAAAGGCC TGTTCTCCTA CGGACGCTAC AAAAACAGCA TCAGCCGCAG
3801 CACCGGTGCG GACGAACATG CGGAAGGCAG CGTCAACGGC ACGCTGATGC
3851 AGCTGGGCGC ACTGGGCGGT GTCAACGTTC CGTTTGCCGC AACGGGAGAT
3901 TTGACGGTCG AAGGCGGTCT GCGCTACGAC CTGCTCAAAC AGGATGCATT
3951 CGCCGAAAAA GGCAGTGCTT TGGGCTGGAG CGGCAACAGC CTCACTGAAG
4001 GCACGCTGGT CGGACTCGCG GGTCTGAAGC TGTCGCAACC CTTGAGCGAT
4051 AAAGCCGTCC TGTTTGCAAC GGCGGGCGTG GAACGCGACC TGAACGGACG
4101 CGACTACACG GTAACGGGCG GCTTTACCGG CGCGACTGCA GCAACCGGCA
4151 AGACGGGGGC ACGCAATATG CCGCACACCC GTCTGGTTGC CGGCCTGGGC
4201 GCGGATGTCG AATTCGGCAA CGGCTGGAAC GGCTTGGCAC GTTACAGCTA
4251 CGCCGGTTCC AAACAGTACG GCAACCACAG CGGACGAGTC GGCGTAGGCT
4301 ACCGGTTCCT CGAGCACCAC CACCACCACC ACTGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGT SAPDFNAGGT
401 GIGSNSRATT AKSAAVSYAG IKNEMCKDRS MLCAGRDDVA VTDRDAKINA
451 PPPNLHTGDF PNPNDAYKNL INLKPAIEAG YTGRGVEVGI VDTGESVGSI
501 SFPELYGRKE HGYNENYKNY TAYMRKEAPE DGGGKDIEAS FDDEAVIETE
551 AKPTDIRHVK EIGHIDLVSH IIGGRSVDGR PAGGIAPDAT LHIMNTNDET
601 KNEMMVAAIR NAWVKLGERG VRIVNNSFGT TSRAGTADLF QIANSEEQYR
651 QALLDYSGGD KTDEGIRLMQ QSDYGNLSYH IRNKNMLFIF STGNDAQAQP
701 NTYALLPFYE KDAQKGIITV AGVDRSGEKF KREMYGEPGT EPLEYGSNHC
751 GITAMWCLSA PYEASVRFTR TNPIQIAGTS FSAPIVTGTA ALLLQKYPWM
801 SNDNLRTTLL TTAQDIGAVG VDSKFGWGLL DAGKAMNGPA SFPFGDFTAD
851 TKGTSDIAYS FRNDISGTGG LIKKGGSQLQ LHGNNTYTGK TIIEGGSLVL
901 YGNNKSDMRV ETKGALIYNG AASGGSLNSD GIVYLADTDQ SGANETVHIK
951 GSLQLDGKGT LYTRLGKLLK VDGTAIIGGK LYMSARGKGA GYLNSTGRRV
1001 PFLSAAKIGQ DYSFFTNIET DGGLLASLDS VEKTAGSEGD TLSYYVRRGN
1051 AARTASAAAH SAPAGLKHAV EQGGSNLENL MVELDASESS ATPETVETAA
1101 ADRTDMPGIR PYGATFRAAA AVQHANAADG VRIFNSLAAT VYADSTAAHA
1151 DMQGRRLKAV SDGLDHNGTG LRVIAQTQQD GGTWEQGGVE GKMRGSTQTV
1201 GIAAKTGENT TAAATLGMGR STWSENSANA KTDSISLFAG IRHDAGDIGY
1251 LKGLFSYGRY KNSISRSTGA DEHAEGSVNG TLMQLGALGG VNVPFAATGD
1301 LTVEGGLRYD LLKQDAFAEK GSALGWSGNS LTEGTLVGLA GLKLSQPLSD



WO 01/64922 



PCT/IB01/00452 




1351 KAVLFATAGV ERDLNGRDYT VTGGFTGATA ATGKTGARNM PHTRLVAGLG 

1401 ADVEFGNGWN GLARYSYAGS KQYGNHSGRV GVGYRFLEHH HHHH* 

961C-ORF46.1

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGAGGAG

1001 GATCAGATTT GGCAAACGAT TCTTTTATCC GGCAGGTTCT CGACCGTCAG 

1051 CATTTCGAAC CCGACGGGAA ATACCACCTA TTCGGCAGCA GGGGGGAACT 

1101 TGCCGAGCGC AGCGGCCATA TCGGATTGGG AAAAATACAA AGCCATCAGT 

1151 TGGGCAACCT GATGATTCAA CAGGCGGCCA TTAAAGGAAA TATCGGCTAC 

1201 ATTGTCCGCT TTTCCGATCA CGGGCACGAA GTCCATTCCC CCTTCGACAA

1251 CCATGCCTCA CATTCCGATT CTGATGAAGC CGGTAGTCCC GTTGACGGAT 

1301 TTAGCCTTTA CCGCATCCAT TGGGACGGAT ACGAACACCA TCCCGCCGAC 

1351 GGCTATGACG GGCCACAGGG CGGCGGCTAT CCCGCTCCCA AAGGCGCGAG 

1401 GGATATATAC AGCTACGACA TAAAAGGCGT TGCCCAAAAT ATCCGCCTCA 

1451 ACCTGACCGA CAACCGCAGC ACCGGACAAC GGCTTGCCGA CCGTTTCCAC

1501 AATGCCGGTA GTATGCTGAC GCAAGGAGTA GGCGACGGAT TCAAACGCGC 

1551 CACCCGATAC AGCCCCGAGC TGGACAGATC GGGCAATGCC GCCGAAGCCT 

1601 TCAACGGCAC TGCAGATATC GTTAAAAACA TCATCGGCGC GGCAGGAGAA 

1651 ATTGTCGGCG CAGGCGATGC CGTGCAGGGC ATAAGCGAAG GCTCAAACAT 

1701 TGCTGTCATG CACGGCTTGG GTCTGCTTTC CACCGAAAAC AAGATGGCGC

1751 GCATCAACGA TTTGGCAGAT ATGGCGCAAC TCAAAGACTA TGCCGCAGCA 

1801 GCCATCCGCG ATTGGGCAGT CCAAAACCCC AATGCCGCAC AAGGCATAGA 

1851 AGCCGTCAGC AATATCTTTA TGGCAGCCAT CCCCATCAAA GGGATTGGAG 

1901 CTGTTCGGGG AAAATACGGC TTGGGCGGCA TCACGGCACA TCCTATCAAG 

1951 CGGTCGCAGA TGGGCGCGAT CGCATTGCCG AAAGGGAAAT CCGCCGTCAG

2001 CGACAATTTT GCCGATGCGG CATACGCCAA ATACCCGTCC CCTTACCATT 

2051 CCCGAAATAT CCGTTCAAAC TTGGAGCAGC GTTACGGCAA AGAAAACATC 

2101 ACCTCCTCAA CCGTGCCGCC GTCAAACGGC AAAAATGTCA AACTGGCAGA 

2151 CCAACGCCAC CCGAAGACAG GCGTACCGTT TGACGGTAAA GGGTTTCCGA 

2201 ATTTTGAGAA GCACGTGAAA TATGATACGC TCGAGCACCA CCACCACCAC

2251 CACTGA 

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGSDLAND SFIRQVLDRQ 

351 HFEPDGKYHL FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY

401 IVRFSDHGHE VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD 

451 GYDGPQGGGY PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH 

501 NAGSMLTQGV GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE 

551 IVGAGDAVQG ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA 

601 AIRDWAVQNP NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK

651 RSQMGAIALP KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI 

701 TSSTVPPSNG KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTLEHHHHH 
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751 H* 






961C-741

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGGGGTG
1001 GTGTCGCCGC CGACATCGGT GCGGGGCTTG CCGATGCACT AACCGCACCG
1051 CTCGACCATA AAGACAAAGG TTTGCAGTCT TTGACGCTGG ATCAGTCCGT
1101 CAGGAAAAAC GAGAAACTGA AGCTGGCGGC ACAAGGTGCG GAAAAAACTT
1151 ATGGAAACGG TGACAGCCTC AATACGGGCA AATTGAAGAA CGACAAGGTC
1201 AGCCGTTTCG ACTTTATCCG CCAAATCGAA GTGGACGGGC AGCTCATTAC
1251 CTTGGAGAGT GGAGAGTTCC AAGTATACAA ACAAAGCCAT TCCGCCTTAA
1301 CCGCCTTTCA GACCGAGCAA ATACAAGATT CGGAGCATTC CGGGAAGATG
1351 GTTGCGAAAC GCCAGTTCAG AATCGGCGAC ATAGCGGGCG AACATACATC
1401 TTTTGACAAG CTTCCCGAAG GCGGCAGGGC GACATATCGC GGGACGGCGT
1451 TCGGTTCAGA CGATGCCGGC GGAAAACTGA CCTACACCAT AGATTTCGCC
1501 GCCAAGCAGG GAAACGGCAA AATCGAACAT TTGAAATCGC CAGAACTCAA
1551 TGTCGACCTG GCCGCCGCCG ATATCAAGCC GGATGGAAAA CGCCATGCCG
1601 TCATCAGCGG TTCCGTCCTT TACAACCAAG CCGAGAAAGG CAGTTACTCC
1651 CTCGGTATCT TTGGCGGAAA AGCCCAGGAA GTTGCCGGCA GCGCGGAAGT
1701 GAAAACCGTA AACGGCATAC GCCATATCGG CCTTGCCGCC AAGCAACTCG
1751 AGCACCACCA CCACCACCAC TGA

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MATNDDDVKK 
AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGVAADIG AGLADALTAP
LDHKDKGLQS LTLDQSVRKN EKLKLAAQGA EKTYGNGDSL NTGKLKNDKV
SRFDFIRQIE VDGQLITLES GEFQVYKQSH SALTAFQTEQ IQDSEHSGKM
VAKRQFRIGD IAGEHTSFDK LPEGGRATYR GTAFGSDDAG GKLTYTIDFA
AKQGNGKIEH LKSPELNVDL AAADIKPDGK RHAVISGSVL YNQAEKGSYS
LGIFGGKAQE VAGSAEVKTV NGIRHIGLAA KQLEHHHHHH

961C-983

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
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CCAAACAGAC 
GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGCGGAGGCG
GCACTTCTGC GCCCGACTTC AATGCAGGCG GTACCGGTAT CGGCAGCAAC
AGCAGAGCAA CAACAGCGAA ATCAGCAGCA GTATCTTACG CCGGTATCAA
GAACGAAATG TGCAAAGACA GAAGCATGCT CTGTGCCGGT CGGGATGACG
TTGCGGTTAC AGACAGGGAT GCCAAAATCA ATGCCCCCCC CCCGAATCTG
CATACCGGAG ACTTTCCAAA CCCAAATGAC GCATACAAGA ATTTGATCAA
CCTCAAACCT GCAATTGAAG CAGGCTATAC AGGACGCGGG GTAGAGGTAG
GTATCGTCGA CACAGGCGAA TCCGTCGGCA GCATATCCTT TCCCGAACTG
TATGGCAGAA AAGAACACGG CTATAACGAA AATTACAAAA ACTATACGGC
GTATATGCGG AAGGAAGCGC CTGAAGACGG AGGCGGTAAA GACATTGAAG
CTTCTTTCGA CGATGAGGCC GTTATAGAGA CTGAAGCAAA GCCGACGGAT
ATCCGCCACG TAAAAGAAAT CGGACACATC GATTTGGTCT CCCATATTAT
TGGCGGGCGT TCCGTGGACG GCAGACCTGC AGGCGGTATT GCGCCCGATG
CGACGCTACA CATAATGAAT ACGAATGATG AAACCAAGAA CGAAATGATG
GTTGCAGCCA TCCGCAATGC ATGGGTCAAG CTGGGCGAAC GTGGCGTGCG
CATCGTCAAT AACAGTTTTG GAACAACATC GAGGGCAGGC ACTGCCGACC
TTTTCCAAAT AGCCAATTCG GAGGAGCAGT ACCGCCAAGC GTTGCTCGAC
TATTCCGGCG GTGATAAAAC AGACGAGGGT ATCCGCCTGA TGCAACAGAG
CGATTACGGC AACCTGTCCT ACCACATCCG TAATAAAAAC ATGCTTTTCA
TCTTTTCGAC AGGCAATGAC GCACAAGCTC AGCCCAACAC ATATGCCCTA
TTGCCATTTT ATGAAAAAGA CGCTCAAAAA GGCATTATCA CAGTCGCAGG
CGTAGACCGC AGTGGAGAAA AGTTCAAACG GGAAATGTAT GGAGAACCGG
GTACAGAACC GCTTGAGTAT GGCTCCAACC ATTGCGGAAT TACTGCCATG
TGGTGCCTGT CGGCACCCTA TGAAGCAAGC GTCCGTTTCA CCCGTACAAA
CCCGATTCAA ATTGCCGGAA CATCCTTTTC CGCACCCATC GTAACCGGCA
CGGCGGCTCT GCTGCTGCAG AAATACCCGT GGATGAGCAA CGACAACCTG
CGTACCACGT TGCTGACGAC GGCTCAGGAC ATCGGTGCAG TCGGCGTGGA
CAGCAAGTTC GGCTGGGGAC TGCTGGATGC GGGTAAGGCC ATGAACGGAC
CCGCGTCCTT TCCGTTCGGC GACTTTACCG CCGATACGAA AGGTACATCC
GATATTGCCT ACTCCTTCCG TAACGACATT TCAGGCACGG GCGGCCTGAT
CAAAAAAGGC GGCAGCCAAC TGCAACTGCA CGGCAACAAC ACCTATACGG
GCAAAACCAT TATCGAAGGC GGTTCGCTGG TGTTGTACGG CAACAACAAA
TCGGATATGC GCGTCGAAAC CAAAGGTGCG CTGATTTATA ACGGGGCGGC
ATCCGGCGGC AGCCTGAACA GCGACGGCAT TGTCTATCTG GCAGATACCG
ACCAATCCGG CGCAAACGAA ACCGTACACA TCAAAGGCAG TCTGCAGCTG
GACGGCAAAG GTACGCTGTA CACACGTTTG GGCAAACTGC TGAAAGTGGA
CGGTACGGCG ATTATCGGCG GCAAGCTGTA CATGTCGGCA CGCGGCAAGG
GGGCAGGCTA TCTCAACAGT ACCGGACGAC GTGTTCCCTT CCTGAGTGCC
GCCAAAATCG GGCAGGATTA TTCTTTCTTC ACAAACATCG AAACCGACGG
CGGCCTGCTG GCTTCCCTCG ACAGCGTCGA AAAAACAGCG GGCAGTGAAG
GCGACACGCT GTCCTATTAT GTCCGTCGCG GCAATGCGGC ACGGACTGCT
TCGGCAGCGG CACATTCCGC GCCCGCCGGT CTGAAACACG CCGTAGAACA
GGGCGGCAGC AATCTGGAAA ACCTGATGGT CGAACTGGAT GCCTCCGAAT
CATCCGCAAC ACCCGAGACG GTTGAAACTG CGGCAGCCGA CCGCACAGAT
ATGCCGGGCA TCCGCCCCTA CGGCGCAACT TTCCGCGCAG CGGCAGCCGT
ACAGCATGCG AATGCCGCCG ACGGTGTACG CATCTTCAAC AGTCTCGCCG
CTACCGTCTA TGCCGACAGT ACCGCCGCCC ATGCCGATAT GCAGGGACGC
CGCCTGAAAG CCGTATCGGA CGGGTTGGAC CACAACGGCA CGGGTCTGCG
CGTCATCGCG CAAACCCAAC AGGACGGTGG AACGTGGGAA CAGGGCGGTG
TTGAAGGCAA AATGCGCGGC AGTACCCAAA CCGTCGGCAT TGCCGCGAAA
ACCGGCGAAA ATACGACAGC AGCCGCCACA CTGGGCATGG GACGCAGCAC
ATGGAGCGAA AACAGTGCAA ATGCAAAAAC CGACAGCATT AGTCTGTTTG
CAGGCATACG GCACGATGCG GGCGATATCG GCTATCTCAA AGGCCTGTTC
TCCTACGGAC GCTACAAAAA CAGCATCAGC CGCAGCACCG GTGCGGACGA
ACATGCGGAA GGCAGCGTCA ACGGCACGCT GATGCAGCTG GGCGCACTGG
GCGGTGTCAA CGTTCCGTTT GCCGCAACGG GAGATTTGAC GGTCGAAGGC
GGTCTGCGCT ACGACCTGCT CAAACAGGAT GCATTCGCCG AAAAAGGCAG
TGCTTTGGGC TGGAGCGGCA ACAGCCTCAC TGAAGGCACG CTGGTCGGAC
TCGCGGGTCT GAAGCTGTCG CAACCCTTGA GCGATAAAGC CGTCCTGTTT
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CGACCTGAAC GGACGCGACT ACACGGTAAC 
CTGCAGCAAC CGGCAAGACG GGGGCACGCA 
GTTGCCGGCC TGGGCGCGGA TGTCGAATTC 
GGCACGTTAC AGCTACGCCG GTTCCAAACA 
GAGTCGGCGT AGGCTACCGG TTCCTCGAGC 



NNGQEINGFK AGETIYDIDE DGTITKKDAT
LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 
LNKLGENITT FAEETKTNIV KIDEKLEAVA 
NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 
DKAEAVAAKV TDIKADIATN KDNIAKKANS 
ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 
LFQPYNVGGS GGGGTSAPDF NAGGTGIGSN 
CKDRSMLCAG RDDVAVTDRD AKINAPPPNL 
AIEAGYTGRG VEVGIVDTGE SVGSISFPEL 
KEAPEDGGGK DIEASFDDEA VIETEAKPTD 
SVDGRPAGGI APDATLHIMN TNDETKNEMM 
NSFGTTSRAG TADLFQIANS EEQYRQALLD 
NLSYHIRNKN MLFIFSTGND AQAQPNTYAL 
SGEKFKREMY GEPGTEPLEY GSNHCGITAM 
IAGTSFSAPI VTGTAALLLQ KYPWMSNDNL 
GWGLLDAGKA MNGPASFPFG DFTADTKGTS 
GSQLQLHGNN TYTGKTIIEG GSLVLYGNNK 
SLNSDGIVYL ADTDQSGANE TVHIKGSLQL 
IIGGKLYMSA RGKGAGYLNS TGRRVPFLSA 
ASLDSVEKTA GSEGDTLSYY VRRGNAARTA 
NLENLMVELD ASESSATPET VETAAADRTD 
NAADGVRIFN SLAATVYADS TAAHADMQGR 
QTQQDGGTWE QGGVEGKMRG STQTVGIAAK 
NSANAKTDSI SLFAGIRHDA GDIGYLKGLF 
GSVNGTLMQL GALGGVNVPF AATGDLTVEG 
WSGNSLTEGT LVGLAGLKLS QPLSDKAVLF 
TGATAATGKT GARNMPHTRL VAGLGADVEF 
HSGRVGVGYR FLEHHHHHH* 



AGTACTGACC ACAGCCATCC TTGCCACTTT 
CCACAAACGA CGACGATGTT AAAAAAGCTG 
GCCTACAACA ATGGCCAAGA AATCAACGGT 
CTACGACATT GATGAAGACG GCACAATTAC 
CCGATGTTGA AGCCGACGAC TTTAAAGGTC 
ACTAACCTGA CCAAAACCGT CAATGAAAAC 
AGTAAAAGCT GCAGAATCTG AAATAGAAAA
ACACTGATGC CGCTTTAGCA GATACTGATG
AACGCCTTGA ATAAATTGGG AGAAAATATA
TAAGACAAAT ATCGTAAAAA TTGATGAAAA
CCGTCGACAA GCATGCCGAA GCATTCAACG
GAAACCAACA CTAAGGCAGA CGAAGCCGTC
ACAGACGGCC GAAGAAACCA AACAAAACGT
CAGAAACTGC AGCAGGCAAA GCCGAAGCTG 
GCAGCCGACA AGGCCGAAGC TGTCGCTGCA 
TGATATCGCT ACGAACAAAG ATAATATTGC 
ACGTGTACAC CAGAGAAGAG TCTGACAGCA 
CTGAACGCTA CTACCGAAAA ATTGGACACA 
ATCCATTGCC GATCACGATA CTCGCCTGAA 
CAGACCTGCG CAAAGAAACC CGCCAAGGCC 
TCCGGTCTGT TCCAACCTTA CAACGTGGGT 
AGATTTGGCA AACGATTCTT TTATCCGGCA
TCGAACCCGA CGGGAAATAC CACCTATTCG 
GAGCGCAGCG GCCATATCGG ATTGGGAAAA 
CAACCTGATG ATTCAACAGG CGGCCATTAA 
TCCGCTTTTC CGATCACGGG CACGAAGTCC 
GCCTCACATT CCGATTCTGA TGAAGCCGGT 
CCTTTACCGC ATCCATTGGG ACGGATACGA
ATGACGGGCC ACAGGGCGGC GGCTATCCCG 
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CTCCCAAAGG 
CGCGAGGGAT ATATACAGCT ACGACATAAA AGGCGTTGCC
CAAAATATCC GCCTCAACCT GACCGACAAC CGCAGCACCG GACAACGGCT
TGCCGACCGT TTCCACAATG CCGGTAGTAT GCTGACGCAA GGAGTAGGCG
ACGGATTCAA ACGCGCCACC CGATACAGCC CCGAGCTGGA CAGATCGGGC
AATGCCGCCG AAGCCTTCAA CGGCACTGCA GATATCGTTA AAAACATCAT
CGGCGCGGCA GGAGAAATTG TCGGCGCAGG CGATGCCGTG CAGGGCATAA
GCGAAGGCTC AAACATTGCT GTCATGCACG GCTTGGGTCT GCTTTCCACC
GAAAACAAGA TGGCGCGCAT CAACGATTTG GCAGATATGG CGCAACTCAA
AGACTATGCC GCAGCAGCCA TCCGCGATTG GGCAGTCCAA AACCCCAATG
CCGCACAAGG CATAGAAGCC GTCAGCAATA TCTTTATGGC AGCCATCCCC
ATCAAAGGGA TTGGAGCTGT TCGGGGAAAA TACGGCTTGG GCGGCATCAC
GGCACATCCT ATCAAGCGGT CGCAGATGGG CGCGATCGCA TTGCCGAAAG
GGAAATCCGC CGTCAGCGAC AATTTTGCCG ATGCGGCATA CGCCAAATAC
CCGTCCCCTT ACCATTCCCG AAATATCCGT TCAAACTTGG AGCAGCGTTA
CGGCAAAGAA AACATCACCT CCTCAACCGT GCCGCCGTCA AACGGCAAAA
ATGTCAAACT GGCAGACCAA CGCCACCCGA AGACAGGCGT ACCGTTTGAC
GGTAAAGGGT TTCCGAATTT TGAGAAGCAC GTGAAATATG ATACGTAACT
CGAG

MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING
FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN
KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI
TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV
KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA
KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT
RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG
GSGGGGSDLA NDSFIRQVLD RQHFEPDGKY HLFGSRGELA ERSGHIGLGK
IQSHQLGNLM IQQAAIKGNI GYIVRFSDHG HEVHSPFDNH ASHSDSDEAG
SPVDGFSLYR IHWDGYEHHP ADGYDGPQGG GYPAPKGARD IYSYDIKGVA
QNIRLNLTDN RSTGQRLADR FHNAGSMLTQ GVGDGFKRAT RYSPELDRSG
NAAEAFNGTA DIVKNIIGAA GEIVGAGDAV QGISEGSNIA VMHGLGLLST
ENKMARINDL ADMAQLKDYA AAAIRDWAVQ NPNAAQGIEA VSNIFMAAIP
IKGIGAVRGK YGLGGITAHP IKRSQMGAIA LPKGKSAVSD NFADAAYAKY
PSPYHSRNIR SNLEQRYGKE NITSSTVPPS NGKNVKLADQ RHPKTGVPFD
GKGFPNFEKH VKYDT*
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1 ATGAAACACT 

TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT
51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG
101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT
151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC
201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC
251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC
301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA
351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG
401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA
451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA
501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG
551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC
601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT
651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG
701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA
751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC
801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA
851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA
901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA
951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC
1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT
1051 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA
1101 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA
1151 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA
1201 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT
1251 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG
1301 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA
1351 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA
1401 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG
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10 



1451 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA
1501 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA
1551 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA
1601 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT
1651 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA
1701 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG
1751 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT
1801 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA

1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING
51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN
101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI
151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV
201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA
251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT
301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG
351 GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ
401 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ
451 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT
501 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD
551 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL
601 AAKQLEHHHH HH*



961CL-983

1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT
51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG
101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT
151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC
201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC
251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC
301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA
351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG
401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA
451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA
501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG
551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC
601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT
651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG
701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA
751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC
801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA
851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA
901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA
951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC
1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT
1051 GGATCCGGCG GAGGCGGCAC TTCTGCGCCC GACTTCAATG CAGGCGGTAC
1101 CGGTATCGGC AGCAACAGCA GAGCAACAAC AGCGAAATCA GCAGCAGTAT
1151 CTTACGCCGG TATCAAGAAC GAAATGTGCA AAGACAGAAG CATGCTCTGT
1201 GCCGGTCGGG ATGACGTTGC GGTTACAGAC AGGGATGCCA AAATCAATGC
1251 CCCCCCCCCG AATCTGCATA CCGGAGACTT TCCAAACCCA AATGACGCAT
1301 ACAAGAATTT GATCAACCTC AAACCTGCAA TTGAAGCAGG CTATACAGGA
1351 CGCGGGGTAG AGGTAGGTAT CGTCGACACA GGCGAATCCG TCGGCAGCAT
1401 ATCCTTTCCC GAACTGTATG GCAGAAAAGA ACACGGCTAT AACGAAAATT
1451 ACAAAAACTA TACGGCGTAT ATGCGGAAGG AAGCGCCTGA AGACGGAGGC
1501 GGTAAAGACA TTGAAGCTTC TTTCGACGAT GAGGCCGTTA TAGAGACTGA
1551 AGCAAAGCCG ACGGATATCC GCCACGTAAA AGAAATCGGA CACATCGATT
1601 TGGTCTCCCA TATTATTGGC GGGCGTTCCG TGGACGGCAG ACCTGCAGGC
1651 GGTATTGCGC CCGATGCGAC GCTACACATA ATGAATACGA ATGATGAAAC
1701 CAAGAACGAA ATGATGGTTG CAGCCATCCG CAATGCATGG GTCAAGCTGG
1751 GCGAACGTGG CGTGCGCATC GTCAATAACA GTTTTGGAAC AACATCGAGG
1801 GCAGGCACTG CCGACCTTTT CCAAATAGCC AATTCGGAGG AGCAGTACCG
1851 CCAAGCGTTG CTCGACTATT CCGGCGGTGA TAAAACAGAC GAGGGTATCC
1901 GCCTGATGCA ACAGAGCGAT TACGGCAACC TGTCCTACCA CATCCGTAAT
1951 AAAAACATGC TTTTCATCTT TTCGACAGGC AATGACGCAC AAGCTCAGCC
2001 CAACACATAT GCCCTATTGC CATTTTATGA AAAAGACGCT CAAAAAGGCA
2051 TTATCACAGT CGCAGGCGTA GACCGCAGTG GAGAAAAGTT CAAACGGGAA
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2101 ATGTATGGAG AACCGGGTAC AGAACCGCTT GAGTATGGCT CCAACCATTG 

2151 CGGAATTACT GCCATGTGGT GCCTGTCGGC ACCCTATGAA GCAAGCGTCC 

2201 GTTTCACCCG TACAAACCCG ATTCAAATTG CCGGAACATC CTTTTCCGCA 

2251 CCCATCGTAA CCGGCACGGC GGCTCTGCTG CTGCAGAAAT ACCCGTGGAT 

2301 GAGCAACGAC AACCTGCGTA CCACGTTGCT GACGACGGCT CAGGACATCG 

2351 GTGCAGTCGG CGTGGACAGC AAGTTCGGCT GGGGACTGCT GGATGCGGGT 

2401 AAGGCCATGA ACGGACCCGC GTCCTTTCCG TTCGGCGACT TTACCGCCGA 

2451 TACGAAAGGT ACATCCGATA TTGCCTACTC CTTCCGTAAC GACATTTCAG 

2501 GCACGGGCGG CCTGATCAAA AAAGGCGGCA GCCAACTGCA ACTGCACGGC 

2551 AACAACACCT ATACGGGCAA AACCATTATC GAAGGCGGTT CGCTGGTGTT 

2601 GTACGGCAAC AACAAATCGG ATATGCGCGT CGAAACCAAA GGTGCGCTGA 

2651 TTTATAACGG GGCGGCATCC GGCGGCAGCC TGAACAGCGA CGGCATTGTC 

2701 TATCTGGCAG ATACCGACCA ATCCGGCGCA AACGAAACCG TACACATCAA 

2751 AGGCAGTCTG CAGCTGGACG GCAAAGGTAC GCTGTACACA CGTTTGGGCA 

2801 AACTGCTGAA AGTGGACGGT ACGGCGATTA TCGGCGGCAA GCTGTACATG 

2851 TCGGCACGCG GCAAGGGGGC AGGCTATCTC AACAGTACCG GACGACGTGT 

2901 TCCCTTCCTG AGTGCCGCCA AAATCGGGCA GGATTATTCT TTCTTCACAA 

2951 ACATCGAAAC CGACGGCGGC CTGCTGGCTT CCCTCGACAG CGTCGAAAAA 

3001 ACAGCGGGCA GTGAAGGCGA CACGCTGTCC TATTATGTCC GTCGCGGCAA 

3051 TGCGGCACGG ACTGCTTCGG CAGCGGCACA TTCCGCGCCC GCCGGTCTGA 

3101 AACACGCCGT AGAACAGGGC GGCAGCAATC TGGAAAACCT GATGGTCGAA 

3151 CTGGATGCCT CCGAATCATC CGCAACACCC GAGACGGTTG AAACTGCGGC 

3201 AGCCGACCGC ACAGATATGC CGGGCATCCG CCCCTACGGC GCAACTTTCC 

3251 GCGCAGCGGC AGCCGTACAG CATGCGAATG CCGCCGACGG TGTACGCATC 

3301 TTCAACAGTC TCGCCGCTAC CGTCTATGCC GACAGTACCG CCGCCCATGC 

3351 CGATATGCAG GGACGCCGCC TGAAAGCCGT ATCGGACGGG TTGGACCACA 

3401 ACGGCACGGG TCTGCGCGTC ATCGCGCAAA CCCAACAGGA CGGTGGAACG 

3451 TGGGAACAGG GCGGTGTTGA AGGCAAAATG CGCGGCAGTA CCCAAACCGT 

3501 CGGCATTGCC GCGAAAACCG GCGAAAATAC GACAGCAGCC GCCACACTGG 

3551 GCATGGGACG CAGCACATGG AGCGAAAACA GTGCAAATGC AAAAACCGAC 

3601 AGCATTAGTC TGTTTGCAGG CATACGGCAC GATGCGGGCG ATATCGGCTA 

3651 TCTCAAAGGC CTGTTCTCCT ACGGACGCTA CAAAAACAGC ATCAGCCGCA 

3701 GCACCGGTGC GGACGAACAT GCGGAAGGCA GCGTCAACGG CACGCTGATG 

3751 CAGCTGGGCG CACTGGGCGG TGTCAACGTT CCGTTTGCCG CAACGGGAGA 

3801 TTTGACGGTC GAAGGCGGTC TGCGCTACGA CCTGCTCAAA CAGGATGCAT 

3851 TCGCCGAAAA AGGCAGTGCT TTGGGCTGGA GCGGCAACAG CCTCACTGAA 

3901 GGCACGCTGG TCGGACTCGC GGGTCTGAAG CTGTCGCAAC CCTTGAGCGA 

3951 TAAAGCCGTC CTGTTTGCAA CGGCGGGCGT GGAACGCGAC CTGAACGGAC 

4001 GCGACTACAC GGTAACGGGC GGCTTTACCG GCGCGACTGC AGCAACCGGC 

4051 AAGACGGGGG CACGCAATAT GCCGCACACC CGTCTGGTTG CCGGCCTGGG 

4101 CGCGGATGTC GAATTCGGCA ACGGCTGGAA CGGCTTGGCA CGTTACAGCT 

4151 ACGCCGGTTC CAAACAGTAC GGCAACCACA GCGGACGAGT CGGCGTAGGC 

4201 TACCGGTTCT GACTCGAG 

1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING 

51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN 

101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI 

151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV 

201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA 

251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT 

301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG 

351 GSGGGGTSAP DFNAGGTGIG SNSRATTAKS AAVSYAGIKN EMCKDRSMLC 

401 AGRDDVAVTD RDAKINAPPP NLHTGDFPNP NDAYKNLINL KPAIEAGYTG 

451 RGVEVGIVDT GESVGSISFP ELYGRKEHGY NENYKNYTAY MRKEAPEDGG 

501 GKDIEASFDD EAVIETEAKP TDIRHVKEIG HIDLVSHIIG GRSVDGRPAG 

551 GIAPDATLHI MNTNDETKNE MMVAAIRNAW VKLGERGVRI VNNSFGTTSR 

601 AGTADLFQIA NSEEQYRQAL LDYSGGDKTD EGIRLMQQSD YGNLSYHIRN 

651 KNMLFIFSTG NDAQAQPNTY ALLPFYEKDA QKGIITVAGV DRSGEKFKRE 

701 MYGEPGTEPL EYGSNHCGIT AMWCLSAPYE ASVRFTRTNP IQIAGTSFSA 

751 PIVTGTAALL LQKYPWMSND NLRTTLLTTA QDIGAVGVDS KFGWGLLDAG 

801 KAMNGPASFP FGDFTADTKG TSDIAYSFRN DISGTGGLIK KGGSQLQLHG 

851 NNTYTGKTII EGGSLVLYGN NKSDMRVETK GALIYNGAAS GGSLNSDGIV 

901 YLADTDQSGA NETVHIKGSL QLDGKGTLYT RLGKLLKVDG TAIIGGKLYM 

951 SARGKGAGYL NSTGRRVPFL SAAKIGQDYS FFTNIETDGG LLASLDSVEK 

1001 TAGSEGDTLS YYVRRGNAAR TASAAAHSAP AGLKHAVEQG GSNLENLMVE 

1051 LDASESSATP ETVETAAADR TDMPGIRPYG ATFRAAAAVQ HANAADGVRI 

1101 FNSLAATVYA DSTAAHADMQ GRRLKAVSDG LDHNGTGLRV IAQTQQDGGT 

1151 WEQGGVEGKM RGSTQTVGIA AKTGENTTAA ATLGMGRSTW SENSANAKTD 

1201 SISLFAGIRH DAGDIGYLKG LFSYGRYKNS ISRSTGADEH AEGSVNGTLM 

1251 QLGALGGVNV PFAATGDLTV EGGLRYDLLK QDAFAEKGSA LGWSGNSLTE 

1301 GTLVGLAGLK LSQPLSDKAV LFATAGVERD LNGRDYTVTG GFTGATAATG 

1351 KTGARNMPHT RLVAGLGADV EFGNGWNGLA RYSYAGSKQY GNHSGRVGVG 

1401 YRF* 

It will be understood that the invention has been described by way of example only and 
that modifications may be made whilst remaining within the scope and spirit of the invention. 
For instance, the use of proteins from other strains is envisaged [e.g. see WO00/66741 for 
polymorphic sequences for ORF4, ORF40, ORF46, 225, 235, 287, 519, 726, 919 and 953]. 



EXPERIMENTAL DETAILS 

FPLC protein purification 
The following table summarises the FPLC protein purification that was used: 

Protein                pI               Column      Buffer              pH    Protocol 
121.1 untagged         6.23             MonoQ       Tris                8.0   A 
128.1 untagged         5.04             MonoQ       Bis-Tris propane    6.5   A 
406.1L                 7.75             MonoQ       Diethanolamine      9.0   B 
576.1L                 5.63             MonoQ       Tris                7.5   B 
593 untagged           8.79             MonoS       Hepes               7.4   A 
726 untagged           4.95             Hi-trap S   Bis-Tris            6.0   A 
[illegible] untagged   10.5 (-leader)   MonoS       Bicine              8.5   C 
919LOrf4               10.4 (-leader)   MonoS       Tris                8.0   B 
920L                   6.92 (-leader)   MonoQ       Diethanolamine      8.5   A 
953L                   7.56 (-leader)   MonoS       MES                 6.6   D 
982 untagged           4.73             MonoQ       Bis-Tris propane    6.5   A 
919-287                6.58             Hi-trap Q   Tris                8.0   A 
953-287                4.92             MonoQ       Bis-Tris propane    6.2   A 


Buffer solutions included 20-120 mM NaCl, 5.0 mg/ml CHAPS and 10% v/v glycerol. The 
dialysate was centrifuged at 13000g for 20 min and applied to either a mono Q or mono S 
FPLC ion-exchange resin. Buffer and ion exchange resins were chosen according to the pI of 
the protein of interest and the recommendations of the FPLC protocol manual [Pharmacia: 
FPLC Ion Exchange and Chromatofocussing; Principles and Methods. Pharmacia 
Publication]. Proteins were eluted using a step-wise NaCl gradient. Purification was 
analysed by SDS-PAGE and protein concentration determined by the Bradford method. 
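
The choice of resin from the pI can be illustrated with a minimal sketch. It assumes the usual rule of thumb (not stated explicitly in this document) that a protein carries a net negative charge when the buffer pH is above its pI and so binds an anion exchanger such as MonoQ, and a net positive charge below its pI and so binds a cation exchanger such as MonoS. The function name `choose_exchanger` is hypothetical, and the rule is a first approximation only: most, but not every, row of the table above follows it.

```python
def choose_exchanger(pi, buffer_ph):
    """Suggest an ion-exchange resin type from protein pI and buffer pH.

    Assumption (rule of thumb, not taken from the source text):
    buffer pH above pI -> net negative protein -> anion exchanger (Q type);
    buffer pH below pI -> net positive protein -> cation exchanger (S type).
    """
    if buffer_ph > pi:
        return "anion exchanger (e.g. MonoQ)"
    if buffer_ph < pi:
        return "cation exchanger (e.g. MonoS)"
    return "undetermined (buffer pH equals pI)"


# pI / pH pairs taken from the table above
print(choose_exchanger(6.23, 8.0))   # 121.1 untagged, Tris pH 8.0
print(choose_exchanger(8.79, 7.4))   # pI 8.79 protein, Hepes pH 7.4
```

In practice the buffer pH would normally be set about one pH unit away from the pI so that the protein is clearly charged; that margin is likewise an assumption here rather than something the text specifies.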

The letter in the 'protocol' column refers to the following: 

FPLC-A: Clones 121.1, 128.1, 593, 726, 982, periplasmic protein 920L and hybrid proteins 
919-287, 953-287 were purified from the soluble fraction of E.coli obtained after disruption 
of the cells. Single colonies harbouring the plasmid of interest were grown overnight at 37°C 
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh 
medium and grown at either 30°C or 37°C until the OD550 reached 0.6-0.8. Expression of 
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After 
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at 
4°C. When necessary cells were stored at -20°C. All subsequent procedures were performed 
on ice or at 4°C. For cytosolic proteins (121.1, 128.1, 593, 726 and 982) and periplasmic 
protein 920L, bacteria were resuspended in 25 ml of PBS containing complete protease 
inhibitor (Boehringer-Mannheim). Cells were lysed by sonication using a Branson 
Sonifier 450. Disrupted cells were centrifuged at 8000g for 30 min to sediment unbroken 
cells and inclusion bodies and the supernatant taken to 35% v/v saturation by the addition of 
3.9 M (NH4)2SO4. The precipitate was sedimented at 8000g for 30 minutes. The supernatant 
was taken to 70% v/v saturation by the addition of 3.9 M (NH4)2SO4 and the precipitate 
collected as above. Pellets containing the protein of interest were identified by SDS-PAGE 
and dialysed against the appropriate ion-exchange buffer (see below) for 6 hours or 
overnight. The periplasmic fraction from E.coli expressing 953L was prepared according to 
the protocol of Evans et al. [Infect. Immun. (1974) 10:1010-1017] and dialysed against the 
appropriate ion-exchange buffer. Buffer and ion exchange resin were chosen according to 
the pI of the protein of interest and the recommendations of the FPLC protocol manual 
(Pharmacia). Buffer solutions included 20 mM NaCl and 10% (v/v) glycerol. The dialysate 
was centrifuged at 13000g for 20 min and applied to either a mono Q or mono S FPLC 
ion-exchange resin. Proteins were eluted from the ion-exchange resin using either step-wise 
or continuous NaCl gradients. Purification was analysed by SDS-PAGE and protein 
concentration determined by the Bradford method. Cleavage of the leader peptide of 
periplasmic proteins was demonstrated by sequencing the NH2-terminus (see below). 

FPLC-B: These proteins were purified from the membrane fraction of E.coli. Single 
colonies harbouring the plasmid of interest were grown overnight at 37°C in 20 ml of 
LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh medium. 
Clones 406.1L and 919LOrf4 were grown at 30°C and Orf25L and 576.1L at 37°C until the 
OD550 reached 0.6-0.8. In the case of 919LOrf4, growth at 30°C was essential since 
expression of recombinant protein at 37°C resulted in lysis of the cells. Expression of 
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After 
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at 
4°C. When necessary cells were stored at -20°C. All subsequent procedures were performed 
at 4°C. Bacteria were resuspended in 25 ml of PBS containing complete protease inhibitor 
(Boehringer-Mannheim) and lysed by osmotic shock with 2-3 passages through a French 
Press. Unbroken cells were removed by centrifugation at 5000g for 15 min and membranes 
precipitated by centrifugation at 100000g (Beckman Ti50, 38000rpm) for 45 minutes. A 
Dounce homogenizer was used to re-suspend the membrane pellet in 7.5 ml of 20 mM 
Tris-HCl (pH 8.0), 1.0 M NaCl and complete protease inhibitor. The suspension was mixed 
for 2-4 hours, centrifuged at 100000g for 45 min and the pellet resuspended in 7.5 ml of 
20 mM Tris-HCl (pH 8.0), 1.0 M NaCl, 5.0 mg/ml CHAPS, 10% (v/v) glycerol and complete 
protease inhibitor. The solution was mixed overnight, centrifuged at 100000g for 45 minutes 
and the supernatant dialysed for 6 hours against an appropriately selected buffer. In the case 
of Orf25L, the pellet obtained after CHAPS extraction was found to contain the recombinant 
protein. This fraction, without further purification, was used to immunise mice. 

FPLC-C: Identical to FPLC-A, but purification was from the soluble fraction obtained after 
permeabilising E.coli with polymyxin B, rather than after cell disruption. 

FPLC-D: A single colony harbouring the plasmid of interest was grown overnight at 37°C 
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh 
medium and grown at 30°C until the OD550 reached 0.6-0.8. Expression of recombinant 
protein was induced with IPTG at a final concentration of 1.0 mM. After incubation for 3 
hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at 4°C. When 
necessary cells were stored at -20°C. All subsequent procedures were performed on ice or at 
4°C. Cells were resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v) glycerol, 
complete protease inhibitor (Boehringer-Mannheim) and disrupted using a Branson Sonifier 
450. The sonicate was centrifuged at 8000g for 30 min to sediment unbroken cells and 
inclusion bodies. The recombinant protein was precipitated from solution between 35% v/v 
and 70% v/v saturation by the addition of 3.9 M (NH4)2SO4. The precipitate was sedimented 
at 8000g for 30 minutes, resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v) 
glycerol and dialysed against this buffer for 6 hours or overnight. The dialysate was 
centrifuged at 13000g for 20 min and applied to the FPLC resin. The protein was eluted from 
the column using a step-wise NaCl gradient. Purification was analysed by SDS-PAGE and 
protein concentration determined by the Bradford method. 
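
The two-step ammonium sulphate cut (35% then 70% v/v saturation) implies a volume of stock solution to add at each step. The sketch below is illustrative only: it assumes that the 3.9 M (NH4)2SO4 stock is treated as 100% saturated, that volumes are additive, and that "% v/v saturation" means the volume fraction of saturated stock in the mixture. The function name and the 25 ml starting volume are hypothetical, since the starting volume is not given for this protocol.

```python
def saturated_stock_to_add(volume_ml, start_fraction, target_fraction):
    """Volume (ml) of saturated ammonium sulphate stock needed to raise a
    sample from start_fraction to target_fraction saturation (0-1 scale).

    Derived from (V*s1 + x) / (V + x) = s2, assuming additive volumes and
    a stock that counts as 100% saturated (an assumption, see lead-in).
    """
    if not 0.0 <= start_fraction < target_fraction < 1.0:
        raise ValueError("need 0 <= start < target < 1")
    return volume_ml * (target_fraction - start_fraction) / (1.0 - target_fraction)


lysate_ml = 25.0                                   # hypothetical starting volume
first_cut = saturated_stock_to_add(lysate_ml, 0.0, 0.35)
second_cut = saturated_stock_to_add(lysate_ml + first_cut, 0.35, 0.70)
print(round(first_cut, 2), round(second_cut, 2))
```

The second call uses the enlarged volume after the first addition; it ignores the small volume removed with the 35% pellet, which is a simplification.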

Cloning strategy and oligonucleotide design 

Genes coding for antigens of interest were amplified by PCR, using oligonucleotides 
designed on the basis of the genomic sequence of N. meningitidis B MC58. Genomic DNA 
from strain 2996 was always used as a template in PCR reactions, unless otherwise specified, 
and the amplified fragments were cloned in the expression vector pET21b+ (Novagen) to 
express the protein as C-terminal His-tagged product, or in pET-24b+ (Novagen) to express 
the protein in 'untagged' form (e.g. ΔG287K). 

Where a protein was expressed without a fusion partner and with its own leader peptide (if 
present), amplification of the open reading frame (ATG to STOP codons) was performed. 

Where a protein was expressed in 'untagged' form, the leader peptide was omitted by 
designing the 5'-end amplification primer downstream from the predicted leader sequence. 
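
How a reverse primer determines whether the product is His-tagged or untagged can be sketched as follows. The sketch assumes the convention visible in the primer table of this section, where a hyphen separates the 5' tail (carrying the restriction site) from the hybridising region, and it assumes that an untagged construct is recognised by a stop codon at the start of the reverse primer's hybridising region when read as a reverse complement. The function names and the small site dictionary are illustrative, not part of the original method.

```python
# Recognition sequences of enzymes named in the primer table
RESTRICTION_SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def sites_in(seq):
    """Names of the listed restriction sites found in seq."""
    return [name for name, site in RESTRICTION_SITES.items() if site in seq.upper()]


def reverse_primer_has_stop(primer):
    """True if the hybridising region of a 'TAIL-HYBRID' reverse primer
    starts with the reverse complement of a stop codon.
    Primers written without a hyphen are not handled and return False."""
    _tail, _sep, hybrid = primer.upper().partition("-")
    return reverse_complement(hybrid[:3]) in STOP_CODONS


untagged_rev = "CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC"   # 287 Rev from the table
his_rev = "CCCGCTCGAG-ATCCTGCTCTTTTTTGCC"           # 287-His Rev from the table
print(sites_in(untagged_rev), reverse_primer_has_stop(untagged_rev))
print(sites_in(his_rev), reverse_primer_has_stop(his_rev))
```

Omitting the stop codon lets translation read through the XhoI site into the vector-encoded His tag, which is consistent with the difference between the two 287 reverse primers shown.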

The melting temperature of the primers used in PCR depended on the number and type of 
hybridising nucleotides in the whole primer, and was determined using the formulae: 

$T_{m1} = 4(G+C) + 2(A+T)$   (tail excluded) 

$T_{m2} = 64.9 + 0.41(\%GC) - 600/N$   (whole primer) 

The melting temperatures of the selected oligonucleotides were usually 65-70°C for the 
whole oligo and 50-60°C for the hybridising region alone. 

Oligonucleotides were synthesised using a Perkin Elmer 394 DNA/RNA Synthesizer, eluted 
from the columns in 2.0 ml NH4OH, and deprotected by 5 hours incubation at 56°C. The 
oligos were precipitated by addition of 0.3 M Na-Acetate and 2 volumes ethanol. The 
samples were centrifuged and the pellets resuspended in water. 
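
The two melting-temperature formulae given in this section can be evaluated with a short sketch. It assumes primers are written as in the table that follows, with a hyphen between the 5' tail and the hybridising region, and it takes N to be the total primer length and %GC the percentage of G and C in the whole primer. The function names are hypothetical.

```python
def split_primer(primer):
    """Split 'TAIL-HYBRID' into (tail, hybridising region).
    A primer without a hyphen is treated as having no tail."""
    tail, sep, hybrid = primer.upper().partition("-")
    return (tail, hybrid) if sep else ("", tail)


def tm1(hybrid):
    """Tm1 = 4(G+C) + 2(A+T), computed on the hybridising region only."""
    gc = hybrid.count("G") + hybrid.count("C")
    at = hybrid.count("A") + hybrid.count("T")
    return 4 * gc + 2 * at


def tm2(whole):
    """Tm2 = 64.9 + 0.41(%GC) - 600/N, computed on the whole primer."""
    n = len(whole)
    gc_percent = 100.0 * (whole.count("G") + whole.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


# Forward primer listed for 919 in the primer table
tail, hybrid = split_primer("CGCGGATCCCATATG-CAAAGCAAGAGCATCCAAA")
print(tm1(hybrid), round(tm2(tail + hybrid), 1))
```

For this primer the hybridising region gives 54°C and the whole primer about 67.8°C, both inside the ranges quoted above (50-60°C and 65-70°C).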



                                Sequences  Restriction site 
Orf1L                      Fwd  CGCGGATCCGCTAGC-AAAACAACCGACAAACGG  NheI 
                           Rev  CCCGCTCGAG-TTACCAGCGGTAGCCTA  XhoI 
Orf1                       Fwd  CTAGCTAGC-GGACACACTTATTTCGGCATC  NheI 
                           Rev  CCCGCTCGAG-TTACCAGCGGTAGCCTAATTTG  XhoI 
Orf1LOmpA                  Fwd  [sequence missing]  NdeI-(NheI) 
                           Rev  CCCGCTCGAG-[truncated]  XhoI 
Orf4L                      Fwd  CGCGGATCCCATATG-AAAACCTTCTTCAAAACC  NdeI 
                           Rev  CCCGCTCGAG-TTATTTGGCTGCGCCTTC  XhoI 
Orf7-1L                    Fwd  GCGGCATTAAT-ATGTTGAGAAAATTGTTGAAATGG  AseI 
                           Rev  GCGGCCTCGAG-TTATTTTTTCAAAATATATTTGC  XhoI 
Orf9-1L                    Fwd  GCGGCCATATG-TTACCTAACCGTTTCAAAATGT  NdeI 
                           Rev  GCGGCCTCGAG-TTATTTCCGAGGTTTTCGGG  XhoI 
Orf23L                     Fwd  CGCGGATCCCATATG-ACACGCTTCAAATATTC  NdeI 
                           Rev  CCCGCTCGAG-TTATTTAAACCGATAGGTAAA  XhoI 
Orf25-1His                 Fwd  CGCGGATCCCATATG-GGCAGGGAAGAACCGC  NdeI 
                           Rev  GCCCAAGCTT-ATCGATGGAATAGCCGCG  HindIII 
Orf29-1 b-His (MC58)       Fwd  CGCGGATCCGCTAGC-AACGGTTTGGATGCCCG  NheI 
                           Rev  CCCGCTCGAG-TTTGTCTAAGTTCCTGATAT 
                                CCCGCTCGAG-ATTCCCACCTGCCATC  XhoI 
Orf29-1 b-L (MC58)         Fwd  CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT  NheI 
                           Rev  CCCGCTCGAG-TTAATTCCCACCTGCCATC  XhoI 
Orf29-1 c-His (MC58)       Fwd  GGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT  NheI 
                           Rev  CCCGCTCGAG-TTGGACGATGCCCGCGA  XhoI 
Orf29-1 c-L (MC58)         Fwd  CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT  NheI 
                           Rev  CCCGCTCGAG-TTATTGGACGATGCCCGC  XhoI 
Orf25L                     Fwd  CGCGGATCCCATATG-TATCGCAAACTGATTGC  NdeI 
                           Rev  CCCGCTCGAG-CTA[illegible]  XhoI 
Orf37L                     Fwd  CGCGGATCCCATATG-[illegible]  NdeI 
                           Rev  CCCGCTCGAG-TCAATAACCCGCCTTCAG  XhoI 
Orf38L                     Fwd  CGCGGATCCCATATG-TTACGTTTGACTGCTTTAGCCGTATGCACC  NdeI 
                           Rev  CCCGCTCGAG-TTATTTTGCCGCGTTAAAAGCGTCGGCAAC  XhoI 
Orf40L                     Fwd  CGCGGATCCCATATG-AACAAAATATACCGCAT  NdeI 
                           Rev  CCCGCTCGAG-TTACCACTGATAACCGAC  XhoI 
Orf40.2-His                Fwd  CGCGGATCCCATATG-ACCGATGACGACGATTTAT  NdeI 
                           Rev  GCCCAAGCTT-CCACTGATAACCGACAGA  HindIII 
Orf40.2L                   Fwd  CGCGGATCCCATATG-AACAAAATATACCGCAT  NdeI 
                           Rev  GCCCAAGCTT-TTACCACTGATAACCGAC  HindIII 
Orf46-2L                   Fwd  GGGAATTCCATATG-GGCATTTCCCGCAAAATATC  NdeI 
                           Rev  CCCGCTCGAG-TTATTTACTCCT  XhoI 
Orf46-2                    Fwd  GGGAATTCCATATG-TCAGATTTGGCAAACGATTCTT  NdeI 
                           Rev  CCCGCTCGAG-TTATTTACTCCTATAA  XhoI 
Orf46.1L                   Fwd  GGGAATTCCATATG-GGCATTTCCCGCAAAATATC  NdeI 
                           Rev  [illegible]  XhoI 
orf46.(His-GST)            Fwd  [illegible]  BamHI-NdeI 
                           Rev  [illegible]  XhoI 


orf46.1-His                Fwd  GGGAATTCCATATGTCAGATTTGGCAAACGATTCTT  NdeI 
                           Rev  CCCGCTCGAGCGTATCATATTTCACGTGC  XhoI 
orf46.2-His                Fwd  GGGAATTCCATATGTCAGATTTGGCAAACGATTCTT  NdeI 
                           Rev  CCCGCTCGAGTTTACTCCTATAACGAGGTCTCTTAAC  XhoI 
Orf65-1-(His/GST) (MC58)   Fwd  CGCGGATCCCATATG-CAAAATGCGTTCAAAATCCC  BamHI-NdeI 
                           Rev  CGCGGATCCCATATG-AACAAAATATACCGCAT 
                                CCCGCTCGAG-TTTGCTTTCGATAGAACGG  XhoI 
Orf72-1L                   Fwd  GCGGCCATATG-GTCATAAAATATACAAATTTGAA  NdeI 
                           Rev  GCGGCCTCGAG-TTAGCCTGAGACCTTTGCAAATT  XhoI 
Orf76-1L                   Fwd  GCGGCCATATG-AAACAGAAAAAAACCGCTG  NdeI 
                           Rev  GCGGCCTCGAG-TTACGGTTTGACACCGTTTTC  XhoI 
Orf83.1L                   Fwd  CGCGGATCCCATATG-AAAACCCTGCTCCTC  NdeI 
                           Rev  CCCGCTCGAG-TTATCCTCCTTTGCGGC  XhoI 
Orf85-2L                   Fwd  GCGGCCATATG-GCAAAAATGATGAAATGGG  NdeI 
                           Rev  GCGGCCTCGAG-TTATCGGCGCGGCGGGCC  XhoI 
Orf91L (MC58)              Fwd  GCGGCCATATGAAAAAATCCTCCCTCATCA  NdeI 
                           Rev  GCGGCCTCGAGTTATTTGCCGCCGTTTTTGGC  XhoI 
Orf91-His (MC58)           Fwd  GCGGCCATATGGCCCCTGCCGACGCGGTAAG  NdeI 
                           Rev  GCGGCCTCGAGTTTGCCGCCGTTTTTGGCTTTC  XhoI 
Orf97-1L                   Fwd  GCGGCCATATG-AAACACATACTCCCCCTGA  NdeI 
                           Rev  GCGGCCTCGAG-TTATTCGCCTACGGTTTTTTG  XhoI 
Orf119L (MC58)             Fwd  GCGGCCATATGATTTACATCGTACTGTTTC  NdeI 
                           Rev  GCGGCCTCGAGTTAGGAGAACAGGCGCAATGC  XhoI 
Orf119-His (MC58)          Fwd  GCGGCCATATGTACAACATGTATCAGGAAAAC  NdeI 
                           Rev  GCGGCCTCGAGGGAGAACAGGCGCAATGCGG  XhoI 
Orf137.1 (His-GST) (MC58)  Fwd  CGCGGATCCGCTAGCTGCGGCACGGCGGG  BamHI-NheI 
                           Rev  CCCGCTCGAGATAACGGTATGCCGCCAG  XhoI 
Orf143-1L                  Fwd  CGCGGATCCCATATG-GAATCAACACTTTCAC  NdeI 
                           Rev  CCCGCTCGAG-TTACACGCGGTTGCTGT  XhoI 


008                        Fwd  CGCGGATCCCATATG-AACAACAGACATTTTG  NdeI 
                           Rev  CCCGCTCGAG-TTACCTGTCCGGTAAAAG  XhoI 
050-1(48)                  Fwd  CGCGGATCCGCTAGC-ACCGTCATCAAACAGGAA  NheI 
                           Rev  CCCGCTCGAG-TCAAGATTCGACGGGGA  XhoI 
105                        Fwd  CGCGGATCCCATATG-TCCGCAAACGAATACG  NdeI 
                           Rev  CCCGCTCGAG-TCAGTGTTCTGCCAGTTT  XhoI 
111L                       Fwd  CGCGGATCCCATATG-CCGTCTGAAACACG  NdeI 
                           Rev  CCCGCTCGAG-TTAGCGGAGCAGTTTTTC  XhoI 
117-1                      Fwd  CGCGGATCCCATATG-ACCGCCATCAGCC  NdeI 
                           Rev  CCCGCTCGAG-TTAAAGCCGGGTAACGC  XhoI 
121-1                      Fwd  GCGGCCATATG-GAAACACAGCTTTACATCGG  NdeI 
                           Rev  GCGGCCTCGAG-TCAATAATAATATCCCGCG  XhoI 
122-1                      Fwd  GCGGCCATATG-ATTAAAATCCGCAATATCC  NdeI 
                           Rev  GCGGCCTCGAG-TTAAATCTTGGTAGATTGGATTTGG  XhoI 
128-1                      Fwd  GCGGCCATATG-ACTGACAACGCACTGCTCC  NdeI 
                           Rev  GCGGCCTCGAG-TCAGACCGCGTTGTCGAAAC  XhoI 
148                        Fwd  CGCGGATCCCATATG-GCGTTAAAAACATCAAA  NdeI 
                           Rev  CCCGCTCGAG-TCAGCCCTTCATACAGC  XhoI 
149.1L (MC58)              Fwd  GCGGCATTAATGGCACAAACTACACTCAAACC  AseI 
                           Rev  GCGGCCTCGAGTTAAAACTTCACGTTCACGCCG  XhoI 
149.1-His (MC58)           Fwd  GCGGCATTAATGCATGAAACTGAGCAATCGGTGG  AseI 
                           Rev  GCGGCCTCGAGAAACTTCACGTTCACGCCGCCGGTAAA  XhoI 
205 (His-GST) (MC58)       Fwd  CGCGGATCCCATATGGGCAAATCCGAAAATACG  BamHI-NdeI 
                           Rev  CCCGCTCGAGATAATGGCGGCGGCGG  XhoI 


206L                       Fwd  CGCGGATCCCATATG-TTTCCCCCCGACAA  NdeI 
                           Rev  CCCGCTCGAG-TCATTCTGTAAAAAAAGTATG  XhoI 
214 (His-GST) (MC58)       Fwd  CGCGGATCCCATATGCTTCAAAGCGACAGCAG  BamHI-NdeI 
                           Rev  CCCGCTCGAGTTCGGATTTTTGCGTACTC  XhoI 
216                        Fwd  CGCGGATCCCATATG-GCAATGGCAGAAAACG  NdeI 
                           Rev  CCCGCTCGAG-CTATACAATCCGTGCCG  XhoI 
225-1L                     Fwd  CGCGGATCCCATATG-GATTCTTTTTTCAAACC  NdeI 
                           Rev  CCCGCTCGAG-TCAGTTCAGAAAGCGGG  XhoI 
235L                       Fwd  CGCGGATCCCATATG-AAACCTTTGATTTTAGG  NdeI 
                           Rev  CCCGCTCGAG-TTATTTGGGCTGCTCTTC  XhoI 
243                        Fwd  CGCGGATCCCATATG-GTAATCGTCTGGTTG  NdeI 
                           Rev  CCCGCTCGAG-CTACGACTTGGTTACCG  XhoI 
247-1L                     Fwd  GCGGCCATATG-AGACGTAAAATGCTAAAGCTAC  NdeI 
                           Rev  GCGGCCTCGAG-TCAAAGTGTTCTGTTTGCGC  XhoI 
264-His                    Fwd  GCCGCCATATG-TTGACTTTAACCCGAAAAA  NdeI 
                           Rev  GCCGCCTCGAG-GCCGGCGGTCAATACCGCCCGAA  XhoI 
270 (His-GST) (MC58)       Fwd  CGCGGATCCCATATGGCGCAATGCGATTTGAC  BamHI-NdeI 
                           Rev  CCCGCTCGAGTTCGGCGGTAAATGCCG  XhoI 
274L                       Fwd  GCGGCCATATG-GCGGGGCCGATTTTTGT  NdeI 
                           Rev  GCGGCCTCGAG-[illegible]  XhoI 
[name missing]             Fwd  GCGGCCATATG-AACTTTGCTTTATCCGTCA  NdeI 
                           Rev  GCGGCCTCGAG-TTAACGGCAGTATTTGTTTAC  XhoI 
[name missing]             Fwd  CGCGGATCCCATATGGGTT[illegible]CGCTTCGGGC  BamHI 
                           Rev  GCCCAAGCTT-[illegible]  HindIII 
286-His (MC58)             Fwd  CGCGGATCCCATATG-GCCGACCTTTCCGAAAA  NdeI 
                           Rev  CCCGCTCGAG-GAAGCGCGTTCCCAAGC  XhoI 
286L (MC58)                Fwd  CGCGGATCCCATATG-CACGACACCCGTAC  NdeI 
                           Rev  CCCGCTCGAG-TTAGAAGCGCGTTCCCAA  XhoI 


287L                       Fwd  CTAGCTAGC-TTTAAACGCAGCGTAATCGCAATGG  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
287                        Fwd  CTAGCTAGC-GGGGGCGGCGGTGGCG  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
287LOrf4                   Fwd  CTAGCTAGCGCTCATCCTCGCCGCC-TGCGGGGGCGGCGGT  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
287-fu                     Fwd  CGGGGATCC-GGGGGCGGCGGTGGCG  BamHI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
287-His                    Fwd  CTAGCTAGC-GGGGGCGGCGGTGGCG  NheI 
                           Rev  CCCGCTCGAG-ATCCTGCTCTTTTTTGCC *  XhoI 
287-His (2996)             Fwd  CTAGCTAGC-TGCGGGGGCGGCGGTGGCG  NheI 
                           Rev  CCCGCTCGAG-ATCCTGCTCTTTTTTGCC  XhoI 
Δ1 287-His                 Fwd  CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC §  NheI 
Δ2 287-His                 Fwd  CGCGGATCCGCTAGC-CAAGATATGGCGGCAGT §  NheI 
Δ3 287-His                 Fwd  CGCGGATCCGCTAGC-GCCGAATCCGCAAATCA *  NheI 
Δ4 287-His                 Fwd  CGCGCTAGC-GGAAGGGTTGATTTGGCTAATGG *  NheI 
Δ4 287MC58-His             Fwd  CGCGCTAGC-GGAAGGGTTGATTTGGCTAATGG §  NheI 
287a-His                   Fwd  CGCCATATG-TTTAAACGCAGCGTAATCGC  NdeI 
                           Rev  CCCGCTCGAG-AAAATTGCTACCGCCATTCGCAGG  XhoI 
287b-His                   Fwd  CGCCATATG-GGAAGGGTTGATTTGGCTAATGG  NdeI 
287b-2996-His              Rev  CCCGCTCGAG-CTTGTCTTTATAA[truncated]  XhoI 
287b-MC58-His              Rev  CCCGCTCGAG-TTTATAAAAGATAATATATTGATTGATTCC  XhoI 
287c-2996-His              Fwd  CGCGCTAGC-ATGCCGCTGATTCCCGTCAATC §  NheI 
'287 untagged' (2996)      Fwd  CTAGCTAGC-GGGGGCGGCGGTGGCG  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
ΔG287-His *                Fwd  CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC  NheI 
                           Rev  CCCGCTCGAG-ATCCTGCTCTTTTTTGCC  XhoI 
ΔG287K (2996)              Fwd  CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
ΔG287-L                    Fwd  CGCGGATCCGCTAGC-TTTGAACGCAGTGTGATTGCAATGGCTTGTATTTTTGCC  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 
ΔG287-Orf4L                Fwd  CGCGGATCCGCTAGC-AAAACCT[illegible]CTCAAAACCCTTTCCGCCGCCGCACTCGCGCTCATCCTCGCCGCCTGCTCGCCCGATGTTAAATCG  NheI 
                           Rev  CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC  XhoI 


292L                       Fwd  CGCGGATCCCATATG-AAAACCAAGTTAATCAAA  NdeI 
                           Rev  CCCGCTCGAG-TTATTGATTTTTGCGGATGA  XhoI 
308-1                      Fwd  CGCGGATCCCATATG-TTAAATCGGGTATTTTATC  NdeI 
                           Rev  CCCGCTCGAG-TTAATCCGCCATTCCCTG  XhoI 
401L                       Fwd  GCGGCCATATG-AAATTACAACAATTGGCTG  NdeI 
                           Rev  GCGGCCTCGAG-TTACCTTACGTTTTTCAAAG  XhoI 
406L                       Fwd  CGCGGATCCCATATG-CAAGCACGGCTGCT  NdeI 
                           Rev  CCCGCTCGAG-TCAAGGTTGTCCTTGTCTA  XhoI 
502-1L                     Fwd  CGCGGATCCCATATG-ATGAAACCGCACAAC  NdeI 
                           Rev  CCCGCTCGAG-TCAGTTGCTCAACACGTC  XhoI 
502-A (His-GST)            Fwd  CGCGGATCCCATATGGTAGACGCGCTTAAGCA  BamHI-NdeI 
                           Rev  CCCGCTCGAGAGCTGCATGGCGGCG  XhoI 
503-1L                     Fwd  CGCGGATCCCATATG-GCACGGTCGTTATAC  NdeI 
                           Rev  CCCGCTCGAG-CTACCGCGCATTCCTG  XhoI 
519-1L                     Fwd  GCGGCCATATG-GAATTTTTCATTATCTTGTT  NdeI 
                           Rev  GCGGCCTCGAG-TTATTTGGCGGTTTTGCTGC  XhoI 
525-1L                     Fwd  GCGGCCATATG-AAGTATGTCCGGTTATTTTTC  NdeI 
                           Rev  GCGGCCTCGAG-TTATCGGCTTGTGCAACGG  XhoI 
529-(His/GST) (MC58)       Fwd  CGCGGATCCGCTAGC-TCCGGCAGCAAAACCGA  BamHI-NheI 
                           Rev  GCCCAAGCTT-ACGCAGTTCGGAATGGAG  HindIII 
552L                       Fwd  GCCGCCATATGTTGAATATTAAACTGAAAACCTTG  NdeI 
                           Rev  GCCGCCTCGAGTTATTTCTGATGCCTTTTCCC  XhoI 
556L                       Fwd  GCCGCCATATGGACAATAAGACCAAACTG  NdeI 
                           Rev  [illegible]  XhoI 
557L                       Fwd  CGCGGATCCCATATG-AACAAACTGTTTCTTAC  NdeI 
                           Rev  CCCGCTCGAG-TCATTCCGCCTTCAGAAA  XhoI 


564ab-(His/GST) (MC58)     Fwd  CGCGGATCCCATATG-CAAGGTATCGTTGCCGACAAATCCGCACCT  BamHI-NdeI 
                           Rev  CCCGCTCGAG-AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT  XhoI 
564abL (MC58)              Fwd  CGCGGATCCCATATG-AACCGCACCCTGTACAAAGTTGTATTTAACAAACATC  NdeI 
                           Rev  CCCGCTCGAG-TTAAGCTAATTGTGCTTGGTTTGCAGATAGGAGTT  XhoI 
564b-(His/GST) (MC58)      Fwd  CGCGGATCCCATATG-ACGGGAGAAAATCATGCGGTTTCACTTCATG  BamHI-NdeI 
                           Rev  CCCGCTCGAG-AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT  XhoI 
564c-(His/GST) (MC58)      Fwd  CGCGGATCCCATATG-GTTTCAGACGGCCTATACAACCAACATGGTGAAATT  BamHI-NdeI 
                           Rev  CCCGCTCGAG-GCGGTAACTGCCGCTTGCACTGAATCCGTAA  XhoI 
564bc-(His/GST) (MC58)     Fwd  CGCGGATCCCATATG-ACGGGAGAAAATCATGCGGTTTCACTTCATG  BamHI-NdeI 
                           Rev  CCCGCTCGAG-GCGGTAACTGCCGCTTGCACTGAATCCGTAA  XhoI 
564d-(His/GST) (MC58)      Fwd  CGCGGATCCCATATG-CAAAGCAAAGTCAAAGCAGACCATGCCTCCGTAA  BamHI-NdeI 
                           Rev  CCCGCTCGAG-TCTTTTCCTTTCAATTATAACTTTAGTAGGT[illegible]GTCCCC  XhoI 
564cd-(His/GST) (MC58)     Fwd  CGCGGATCCCATATG-[sequence missing]  BamHI-NdeI 
                           Rev  CCCGCTCGAG-TCTTTTCCTTTCAATTATAACTTTAGTAGGTTCAATTTTGGTCCCC  XhoI 
570L                       Fwd  GCGGCCATATG-ACCCGTTTGACCCGCG  NdeI 
                           Rev  GCGGCCTCGAG-TCAGCGGGCGTTCATTTCTT  XhoI 
576-1L                     Fwd  CGCGGATCCCATATG-AACACCATTTTCAAAATC  NdeI 
                           Rev  CCCGCTCGAG-TTAATTTACTTTTTTGATGTCG  XhoI 
580L                       Fwd  GCGGCCATATG-GATTCGCCCAAGGTCGG  NdeI 
                           Rev  GCGGCCTCGAG-CTACACTTCCCCCGAAGTGG  XhoI 
583L                       Fwd  CGCGGATCCCATATG-ATAGTTGACCAAAGCC  NdeI 
                           Rev  CCCGCTCGAG-TTATTTTTCCGATTTTTCGG  XhoI 
593                        Fwd  GCGGCCATATG-CTTGAACTGAACGGACT  NdeI 
                           Rev  GCGGCCTCGAG-TCAGCGGAAGCGGACGATT  XhoI 
650 (His-GST) (MC58)       Fwd  CGCGGATCCCATATGTCCAAACTCAAAACCATCG  BamHI-NdeI 
                           Rev  CCCGCTCGAGGCTTCCAATCAGTTTGACC  XhoI 
652                        Fwd  GCGGCCATATG-AGCGC[illegible]ATCGTTGATATTTTC  NdeI 
                           Rev  GCGGCCTCGAG-TTATTTGCCCAGTTGGTAGAATG  XhoI 
664L                       Fwd  GCGGCCATATG-GTGATACATCCGCACTACTTC  NdeI 
                           Rev  GCGGCCTCGAG-TCAAAATCGAGTTTTACACCA  XhoI 
726                        Fwd  GCGGCCATATG-ACCATCTATTTCAAAAACGG  NdeI 
                           Rev  GCGGCCTCGAG-TCAGCCGATGTTTAGCGTCCATT  XhoI 
741-His (MC58)             Fwd  CGCGGATCCCATATG-AGCAGCGGAGGGGGTG  NdeI 
                           Rev  CCCGCTCGAG-TTGCTTGGCGGCAAGGC  XhoI 
ΔG741-His (MC58)           Fwd  CGCGGATCCCATATG-GTCGCCGCCGACATCG  NdeI 
                           Rev  CCCGCTCGAG-TTGCTTGGCGGCAAGGC  XhoI 


686-2-(His/GST) (MC58)     Fwd  CGCGGATCCCATATG-GGCGGTTCGGAAGGCG  BamHI-NdeI 
                           Rev  CCCGCTCGAG-TTGAACACTGATGTCTTTTCCGA  XhoI 
719-(His/GST) (MC58)       Fwd  CGCGGATCCGCTAGC-AAACTGTCGTTGGTGTTAAC  BamHI-NheI 
                           Rev  CCCGCTCGAG-TTGACCCGCTCCACGG  XhoI 
730-His (MC58)             Fwd  GCCGCCATATGGCGGACTTGGCGCAAGACCC  NdeI 
                           Rev  GCCGCCTCGAGATCTCCTAAACCTGTTTTAACAATGCCG  XhoI 
730A-His (MC58)            Fwd  GCCGCCATATGGCGGACTTGGCGCAAGACCC  NdeI 
                           Rev  GCGGCCTCGAGCTCCATGCTGTTGCCCCAGC  XhoI 
730B-His (MC58)            Fwd  GCCGCCATATGGCGGACTTGGCGCAAGACCC  NdeI 
                           Rev  GCGGCCTCGAGAAAATCCCCGCTAACCGCAG  XhoI 
741-His (MC58)             Fwd  CGCGGATCCCATATG-AGCAGCGGAGGGGGTG  NdeI 
                           Rev  CCCGCTCGAG-TTGCTTGGCGGCAAGGC  XhoI 
ΔG741-His (MC58)           Fwd  CGCGGATCCCATATG-GTCGCCGCCGACATCG  NdeI 
                           Rev  CCCGCTCGAG-TTGCTTGGCGGCAAGGC  XhoI 
743 (His-GST)              Fwd  CGCGGATCCCATATGGACGGTGTTGTGCCTGTT  BamHI-NdeI 
                           Rev  CCCGCTCGAGCTTACGGATCAAATTGACG  XhoI 
757 (His-GST) (MC58)       Fwd  CGCGGATCCCATATGGGCAGCCAATCTGAAGAA  BamHI-NdeI 
                           Rev  CCCGCTCGAGCTCAGCTTTTGCCGTCAA  XhoI 
759-His/GST (MC58)         Fwd  CGCGGATCCGCTAGC-TACTCATCCATTGTCCGC  BamHI-NheI 
                           Rev  CCCGCTCGAG-CCAGTTGTAGCCTATTTTG  XhoI 
759L (MC58)                Fwd  CGCGGATCCGCTAGC-ATGCGCTTCACACACAC  NheI 
                           Rev  CCCGCTCGAG-TTACCAGTTGTAGCCTATTT  XhoI 


760-His 


Fwd 
Rev 
Fwd 


GCCGCCATATGGCACAAACGGAAGGTTTGGAA 

GCCGCCTCGAGAAAACTGTAACGCAGGTTTGCCGTC 

GCGGCCATATGGAAGAAACACCGCGCGAACCG 


Ndel 
Xhol 
Ndel 




769-His (MC58) 
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Rev 


GCGGCCTCGAGGAACGTTTTATTAAACTCGAC 


Xhol . 


907L 


Fwd 


GCGGCCAX^TG-AGAAAACCGACCGATACCCTA . 


Ndel 


Rev 


GCGGCCTCGAG-TCAACGCCACTGCCAGCGGTTG 


Xhol 


911L 


Fwd 


CGCGGATCCCAIATG-AAGAAGAACATATTGGAATTTTGGGTCGGACTG 


Ndel 


Rev 


CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCXrG 


Xhol 


911LOmpA 


Fwd 


GGGAATTCCATATGAAAAAGACAGCTATCGCGATTGCA 
GTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCGC 
TAGC-GCTTTCCGCGTGGCCGGCGGTGC 


NdeI-(NheI) 


Rev 


CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCCG


Xhol 


911LPelB 


Fwd 


CATGCCATGG-CTTTCCGCGTGGCCGGCGGTGC 


Ncol 


Rev 


CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCCG 


Xhol 


913-His/GST 
(MC58) 


Fwd 
Rev 


CGCGGATCCCATATG-TTTGCCGAAACCCGCC 
CCCGCTCGAG-AGGTTGTGTTCCAGGTTG 


BamHI-Ndel 
Xhol 


913L 
(MC58) 


Fwd 


CGCGGATCCCATATG-AAAAAAACCGCCTATG 


Ndel 


Rev 


CCCGCTCGAG-TTAAGGTTGTGTTCCAGG 


Xhol 


919L 


Fwd 


CGCGGATCCCATATG-AAAAAATACCTATTCCGC 


Ndel 


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGG 


Xhol 


919 


Fwd 


CGCGGATCCCATATG-CAAAGCAAGAGCATCCAAA 


Ndel 


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGG 


Xhol 


919L Orf4 


Fwd 


GGGAATTCCATATGAAAACCTTCTTCAAAACCCTTTCCG

CCGCCGCGCTAGCGCTCATCCTCGCCGCC- 

TGCCAAAGCAAGAGCATC 


NdeI-(NheI)


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGGGCTTCATACCG 


Xhol 


(919)-287fusion 


Fwd 


CGCGGATCCGTCGAC-TGTGGGGGCGGCGGTGGC 


SalI


Rev 


CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC


Xhol 


920-1L 


Fwd 


GCGGCCATATG-AAGAAAACATTGACACTGC 


Ndel 


Rev 


GCGGCCTCGAG-TTAATGGTGCGAATGACCGAT 


Xhol 


925-His/GST 
(MC58) GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctTGCGGCAAGGATGCCGG 
ggggaccactttgtacaagaaagctgggtCTAAAGCAACAATGCCGG 


attBl 
attB2 




926L 


Fwd 


CGCGGATCCCATATG-AAACACACCGTATCC 


Ndel 


Rev 


CCCGCTCGAG-TTATCTCGTGCGCGCC 


Xhol 


927-2-(His/GST)
(MC58) 


Fwd 


CGCGGATCCCATATG-AGCCCCGCGCCGATT 


BamHI-Ndel 


Rev 


CCCGCTCGAG-TTTTTGTGCGGTCAGGCG 


Xhol 


932-His/GST 
(MC58) GATE


Fwd 


ggggacaagtttgtacaaaaaagcaggctTGTTCGTTTGGGGGATTTAA 

APPA A Apr A A ATP 


attBl 


935 (His-GST) 
(MC58) 


For 


CGCGGATCCCATATGGCGGATGCGCCCGCG 


BamHI-Ndel 


Rev 


CCCGCTCGAGAAACCGCCAATCCGCC 


Xhol 




Rev 


ggggaccactttgtacaagaaagctgggtTCATTITGTTTTTCCTTCTTCT 
CGAGGCCATT 


attB2 


Fwd


VAjUUU A 1 CCC A 1 A 1 u- AAAUv^CAAACCuCAC 


NdeI


Rev 


CCCGCTCGAG-TCAGCGTTGGACGTAGT


Xhol 


953L 


Fwd 


GGGAATTCCATATG-AAAAAAATCATCTTCGCCG


Ndel 


Rev 


CCCGCTCGAG-TTATTGTTTGGCTGCCTCGAT 


Xhol 


953-fu 


Fwd 


GGGAATTCCATATG-GCCACCTACAAAGTGGACG


Ndel 


Rev 


CGGGGATCC-TTGTTTGGCTGCCTCGATTTG


BamHI
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954 (His-GST) 
(MC58) 


Fwd


cncnci ATrrrATATnrA aPiA apa atphpapa a ar 


BamHI-NdeI


Rev 


CCCGCTCGAGTTTTTTCGGCAAATTGGCTT


Xhol 


958-His/GST 
(MC58) GATE


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctGCCGATGCCGTTGCGG 
ggggaccactttgtacaagaaagctgggtTCAGGGTCGTTTGTTGCG 


attBl 
attB2 


961L 


Fwd 


CGCGGATCCCATATG-AAACACTTTCCATCC 


Ndel 


Rev 


CCCGCTCGAG-TTACCACTCGTAATTGAC 


Xhol 


961 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGAC 


Ndel 


Rev 


CCCGCTCGAG-TTACCACTCGTAATTGAC 


Xhol 


961 c (His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACG 


BamHI-Ndel 


Rev 


CCCGCTCGAG-ACCCACGTTGTAAGGTTG 


Xhol 


961 c-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGACGA 


BamHI-Ndel 


Rev 


CCCGCTCGAG-ACCCACGTTGTAAGGTTG 


Xhol 


961 c-L 


Fwd 


CGCGGATCCCATATG-ATGAAACACTTTCCATCC 


Ndel 


Rev 


CCCGCTCGAG-TTAACCCACGTTGTAAGGT 


Xhol 


961 c-L 
(MC58) 


Fwd 


CGCGGATCCCATATG-ATGAAACACTTTCCATCC


Ndel 


Rev 


CCCGCTCGAG-TTAACCCACGTTGTAAGGT 


Xhol 


961 d (His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACG 


BamHI-Ndel 




Rev 


CCCGCTCGAG-GTCTGACACTGTTTTATCC 


Xhol 


961A1-L 


Fwd 


CGCGGATCCCATATG-ATGAAACACTTTCCATCC 


Ndel 


Rev 


CCCGCTCGAG-TTATGCTTTGGCGGCAAAG 


Xhol 


fu961-... 


Fwd 


CGCGGATCCCATATG- GCCACAAACGACGAC 


Ndel 


Rev 


CGCGGATCC-CCACTCGTAATTGACGCC 


BamHI


fu 961-... 
(MC58) 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGAC


Ndel 


Rev 


CGCGGATCC-CCACTCGTAATTGACGCC 


BamHI 


fu961c-... 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACGAC


Ndel 




Rev 


CGCGGATCC -ACCCACGTTGTAAGGTTG 


BamHI 


fu 961 c-L-... 


Fwd 


CGCGGATCCCATATG- ATGAAACACTTTCCATCC 


Ndel 


Rev 


CGCGGATCC -ACCCACGTTGTAAGGTTG 


BamHI 


fu (961 )- 
741(MC58)-His 


Fwd


nebcin a tvp nn a riricirirYvcirvvrYTC^ri 




Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC 


Xhol 


fu (Q<t \ Oft 4 * Tlie 

Iu [yox )-yoj"ijJS 


rWO 


ennnn atcc 1 pppppapppphpaptt 


TJomHT 


"D nil 

rvCv 


rT'pnPTPn a n-n a a rrnnT a hppt a pn 




fu (961)- Orf46.1- 
His 


Fwd


TCAGATTTGGCAAACGATTC 


BamHI


Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


fu (961c-L)-
741(MC58) 


Fwd


V-VJ\-^VJU/\ I -\J\J/\VJ\JVJVJVJ 1 VJV J 1 VJ 1 V^VJ 


UflllU MX 




Rev 


CCCGCTCGAG-TTATTGCTTGGCGGCAAG


Xhol 


fu (961c-L)-983 


Fwd 


CGCGGATCC - GGCGGAGGCGGCACTT 


BamHI 


Rev 


CCCGCTCGAG-TCAGAACCGGTAGCCTAC


Xhol 


fu (961c-L)- 
Orf46.1 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
TCAGATTTGGCAAACGATTC 


BamHI 


Rev 


CCCGCTCGAG-TTACGTATCATATTTCACGTGC


Xhol 


961-(His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGACG 


BamHI-Ndel 
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(MC58) 


Rev 


CCCGCTCGAG-CCACTCGTAATTGACGCC


Xhol 


961A1-His 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACGAC


Ndel 




Rev 


CCCGCTCGAG-TGCTTTGGCGGCAAAGTT 


Xhol 


961a-(His/GST) 


Fwd 
Rev 


CGCGGATCCCATATG-GCCACAAACGACGAC

CCCUCICAjALt-i 1 1 AvjCAAIAI lAlw 1 lul ItulAuL 


BamHI-NdeI

XhoI


961b-(His/GST) 


T? 1 

Fwd 


COCCju ATCCC A 1 A 1 u- AAAOCAA AUCU 1 OCCU A 


iJaniMi-iNaei 


Rev 


rrpfinY 1 ^ a n or* a ptpht a a tty. a rnrf 

lAAAJl* 1 LuAU-LLAL 1 CO X AA 1 1 UA^OV^ 


An oi 


961-His/GST GATE


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctGCAGCCACAAACGACGACG
ATGTTAAAAAAGC 

ggggaccactttgtacaagaaagctgggtTTACCACTCGTAATTGACGC 
CGACATGGTAGG 


attB1
attB2 




982 


Fwd 


GCGGCCATATG-GCAGCAAAAGACGTACAGTT 


Ndel 


Rev 


GCGGCCTCGAG-TTACATCATGCCGCCCATACCA 


Xhol 


983-His (2996) 


Fwd 


CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


ΔG983-His (2996)


Fwd 


CCCCTAGCTAGC-ACTTCTGCGCCCGACTT 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


983-His


Fwd 


CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


ΔG983-His


Fwd 


CGCGGATCCGCTAGC-ACTTCTGCGCCCGACTT 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


983L


Fwd


CGCGGATCCGCTAGC-
CGAACGACCCCAACCTTCCCTACAAAAACTTTCAA


NheI


Rev


CCCGCTCGAG-TCAGAACCGACGTGCCAAGCCGTTC


XhoI


987-His (MC58)


Fwd


GCCGCCATATGCCCCCACTGGAAGAACGGACG


NdeI


Rev


GCCGCCTCGAGTAATAAACCTTCTATGGGCAGCAG


XhoI




989-(His/GST)
(MC58)


Fwd


CGCGGATCCCATATG-TCCGTCCACGCATCCG


BamHI-NdeI


Rev


CCCGCTCGAG-TTTGAATTTGTAGGTGTATTG


XhoI


989L
(MC58)


Fwd


CGCGGATCCCATATG-ACCCCTTCCGCACT


NdeI


Rev


CCCGCTCGAG-TTATTTGAATTTGTAGGTGTAT


XhoI


CrgA-His
(MC58)


Fwd


CGCGGATCCCATATG-AAAACCAATTCAGAAGAA


NdeI


Rev


CCCGCTCGAG-TCCACAGAGATTGTTTCC


XhoI


PilC1-ES
(MC58)


Fwd


GATGCCCGAAGGGCGGG


Rev


GCCCAAGCTT-TCAGAAGAAGACTTCACGC


PilC1-His
(MC58)


Fwd


CGCGGATCCCATATG-CAAACCCATAAATACGCTATT


NdeI


Rev


GCCCAAGCTT-GAAGAAGACTTCACGCCAG


HindIII


Δ1PilC1-His
(MC58) 


Fwd 


CGCGGATCCCATATG-GTCTTTTTCGACAATACCGA 


Ndel 


Rev 


GCCCAAGCTT- 


HindIII


PilC1L
(MC58) 


Fwd 
Rev 


CGCGGATCCCATATG-AATAAAACTTTAAAAAGGCGG 
GCCCAAGCTT-TCAGAAGAAGACTTCACGC


NdeI
HindIII


AGTbp2-His 
(MC58) 


Fwd 


CGCGAATCCCATATG-TTCGATCTTGATTCTGTCGA 


Ndel 


Rev 


CCCGCTCGAG-TCGCACAGGCTGTTGGCG 


Xhol 


Tbp2-His
(MC58)


Fwd 


CGCGAATCCCATATG-TTGGGCGGAGGCGGCAG


Ndel 


Rev 


CCCGCTCGAG-TCGCACAGGCTGTTGGCG 


Xhol 


Tbp2-His(MC58) 


Fwd 


CGCGAATCCCATATG-TTGGGCGGAGGCGGCAG 


Ndel 


Rev 


CCCGCTCGAG-TCGCACAGGCTGTTGGCG 


Xhol 
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NMB0109- 
(His/GST) 


Fwd 


CGCGGATCCCATATG-GCAAATTTGGAGGTGCGC 


BamHI-Ndel 


(MC58)


Rev 


CCCGCTCGAG-TTCGGAGCGGTTGAAGC 


Xhol 


NMB0109L 
(MC58)


Fwd 


CGCGGATCCCATATG-CAACGTCGTATTATAACCC 


Ndel 


Rev 


CCCGCTCGAG-TTATTCGGAGCGGTTGAAG 


Xhol 


NMB0207- 
(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG- 

GGCATCAAAGTCGCCATCAACGGCTAC


BamHI-Ndel 


Rev 


CCCGCTCGAG-TTTGAGCGGGCGCACTTCAAGTCCG 


Xhol 


NMB0462- 

(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-GGCGGCAGCGAAAAAAAC 


BamHI-Ndel 


Rev 


CCCGCTCGAG-GTTGGTGCCGACnTGAT 


Xhol 


NMB0623- 
(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-GGCGGCGGAAGCGATA 


BamHI-Ndel 


Rev 


CCCGCTCGAG-TTTGCCCGCTTTGAGCC


Xhol 


NMB0625 (His- 
GST)(MC58) 


Fwd 


CGCGGATCCCATATGGGCAAATCCGAAAATACG 


BamHI-Ndel 


Rev 


CCCGCTCGAGCATCCCGTACTGTTTCG


Xhol 


NMB0634 
(His/GST)(MC58) 


Fwd 


ggggacaagtttgtacaaaaaagcaggctCCGACATTACCGTGTACAAC 
GGCCAACAAAGAA 


attB1


Rev 


ggggaccactttgtacaagaaagctgggtCTTATTTCATACCGWjCTTGCT
CAAGCAGCCGG 


attB2 


NMB0776- 
His/GST (MC58) 

GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctGATACGGTGTTTTCCTGTAA 
AACGGACAACAA 

ggggaccactttgtacaagaaagctgggtCTAGGAAAAATCGTCATCGT 
TGAAATTCGCC 


attB1
attB2


NMB1115- 
His/GST (MC58)

GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctATGCACCCCATCGAAACC 
ggggaccactttgtacaagaaagctgggtCTAGTCTTGCAGTGCCTC


attB1
attB2


NMB1343- 
(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG- 
GGAAATTTCTTATATAGAGGCATTAG 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

nTTA ATTTPTATrA ArTPTTTAnPA ATA AT* 


Xhol 


NMB1369(His- 
GST (MC58) 


Fwd 


CGCGGATCCCATATGGCCTGCCAAGACGACA 


BamHI-Ndel 


Rev


CCCGCTCG A (rTTfirCTCCTCiCCG AAA 


Xhol 


NMB1551 (His- 
GST)(MC58) 


Fwd 


CGCGGATCCCATATGGCAGAGATCTGTTTGATAA 


BamHI-Ndel 


Rev 


CCCGCTCGAGCGGTTTTCCGCCCAATG 


Xhol 


NMB1899 (His-
GST)(MC58) 


Fwd 


CGCGGATCCCATATGCAGCCGGATACGGTC 


BamHI-Ndel 


Rev 


CCCGCTCGAGAATCACTTCCAACACAAAAT 


XhoI


NMB2050-
(His/GST) 

(MC58) 


Fwd 


CGCGGATCCCATATG-TGGTTGCTGATGAAGGGC 


BamHI-Ndel 


Rev 


CCCGCTCGAG-GACTGCTTCATCTTCTGC


Xhol 


NMB2050L 
(MC58) 


Fwd 


CGCGGATCCCATATG-GAACTGATGACTGTTTTGC 


Ndel 


Rev 


CCCGCTCGAG-TCAGACTGCTTCATCTTCT 


Xhol 


NMB2159-
(His/GST)
(MC58)


Fwd


CGCGGATCCCATATG-
AGCATTAAAGTAGCGATTAACGGTTTCGGC


BamHI-NdeI


Rev


CCCGCTCGAG-
GATTTTGCCTGCGAAGTATTCCAAAGTGCG


XhoI


fu-ΔG287...-His


Fwd


CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC


NheI


Rev


CGGGGATCC-ATCCTGCTCTTTTTTGCCGG


BamHI


fu-(AG287)-919- 
His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
CAAAGCAAGAGCATCCAAACC 


BamHI 




Rev 


CCCAAGCTT-TTCGGGCGGTATTCGGGCTTC 


HindIII


fu-(ΔG287)-953-
His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
GCCACCTACAAAGTGGAC 


BamHI 


Rev 


GCCCAAGCTT-TTGTTTGGCTGCCTCGAT 


HindIII


fu-(ΔG287)-961-
His 


Fwd


CGCGGATCCGGTGGTGGTGGT-ACAAGCGACGACG


BamHI


Rev


GCCCAAGCTT-CCACTCGTAATTGACGCC


HindIII


fu-(ΔG287)-
Orf46.1-His


Fwd


CGCGGATCCGGTGGTGGTGGT-
TCAGATTTGGCAAACGATTC


BamHI


Rev


CCCAAGCTT-CGTATCATATTTCACGTGC


HindIII


fu-(ΔG287-919)-
Orf46.1-His


Fwd


CCCAAGCTTGGTGGTGGTGGTGGT-
TCAGATTTGGCAAACGATTC


HindIII


Rev


CCCGCTCGAG-CGTATCATATTTCACGTGC


XhoI


fu-(ΔG287-
Orf46.1)-919-His


Fwd


CCCAAGCTTGGTGGTGGTGGTGGT-
CAAAGCAAGAGCATCCAAACC


HindIII


Rev 


CCCGCTCGAG-CGGGCGGTATTCGGGCTT 


Xhol 


fu ΔG287(394.98)-...


Fwd


CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC


NheI


Rev


CGGGGATCC-ATCCTGCTCTTTTTTGCCGG


BamHI


fu Orf1-(Orf46.1)-
His 


Fwd 


CGCGGATCCGCTAGC-GGACACACTTATTTCGGCATC 


Nhel 


Rev 


CGCGGATCC-CCAGCGGTAGCCTAATTTGAT 




fu (Orf1)-Orf46.1-
His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
TCAGATTTGGCAAACGATTC


BamHI 


Rev 


CCCAAGCTT-CGTATCATATTTCACGTGC


HindIII


fu (919)-Orf46.1- 
His 


Fwd1


GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAG


SalI


Fwd2 


GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC 




Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


Fuorf46-.... 


Fwd 


GGAATTCCATATGTCAGATTTGGCAAACGATTC 


Ndel 


Rev 


CGCGGATCCCGTATCATATTTCACGTGC


BamHI


Fu (orf46)-287-His


Fwd


v^vJvJvJvJ/V 1 V^l^vJ vJvJvJVJV^\JVJV*vJvJ 1 VJVJ V-»vJ 


BamHI


Rev


v^v»v^nLfVvJv* x xr\x V-v^ 1 UV^ xv^xxixxx vj V^v^VJ VJv^ 


HindIII


Fu (orf46)-919-His 


Fwd


CGCGGATCCGGTGGTGGTGGTCAAAGCAAGAGCATCCA
AACC


BamHI


Rev


CCCAAGCTTCGGGCGGTATTCGGGCTTC


HindIII


Fu(orf46-919)-
287-His 


Fwd 


CCCCAAGCTTGGGGGCGGCGGTGGCG


HindIII


Rev 


CCCGCTCGAGATCCTGCTCTTTTTTGCCGGC


Xhol 


Fu(orf46-287)- 
919-His 


Fwd 


CCCAAGCTTGGTGGTGGTGGTGGTCAAAGCAAGAGCAT
CCAAACC


HindIII


Rev 


CCCGCTCGAGCGGGCGGTATTCGGGCTT


Xhol 


(ΔG741)-961c-His


Fwd1
Fwd2


GGAGGCACTGGATCCGCAGCCACAAACGACGACGA 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG 


Xhol 


Rev 


CCCGCTCGAG-ACCCAGCTTGTAAGGTTG 


Xhol 


(ΔG741)-961-His


Fwd1
Fwd2


GGAGGCACTGGATCCGCAGCCACAAACGACGACGA 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG 


Xhol 


Rev 


CCCGCTCGAG-CCACTCGTAATTGACGCC 


Xhol 
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(AG741 )-983-His 


Fwd 


GCGGCCTCGAG- 

GGATCCGGCGGAGGCGGCACTTCTGCG 


Xhol 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


(ΔG741)-orf46.1-
His


Fwd1
Fwd2


GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC 
GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA 


SalI


Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


(ΔG983)-
741(MC58)-His 


Fwd 


GCGGCCTCGAG-GGATCCGGAGGGGGTGGTGTCGCC 


Xhol 


Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAG 


Xhol 


(ΔG983)-961c-His


Fwd1
Fwd2


GGAGGCACTGGATCCGCAGCCACAAACGACGACGA 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG


Xhol 




Rev 


CCCGCTCGAG-ACCCAGCTTGTAAGGTTG 


Xhol 


(ΔG983)-961-His


Fwd1
Fwd2


GGAGGCACTGGATCCGCAGCCACAAACGACGACGA 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG


Xhol 


Rev 


CCCGCTCGAG-CCACTCGTAATTGACGCC 


Xhol 


(ΔG983)- Orf46.1-
His


Fwd1
Fwd2


GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC 
GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA 


SalI


Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 



This primer was used as a Reverse primer for all the C terminal fusions of 287 to the His-tag. 

5 Forward primers used in combination with the 287-His Reverse primer. 
NB - All PCR reactions use strain 2996 unless otherwise specified (e.g. strain MC58) 



In all constructs starting with an ATG not followed by a unique NheI site, the ATG codon is
part of the NdeI site used for cloning. The constructs made using NheI as a cloning site at the
5' end (e.g. all those containing 287 at the N-terminus) have two additional codons (GCT
AGC) fused to the coding sequence of the antigen.
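
As a rough illustration of the primer-tail conventions described above, the following Python sketch scans a primer for the recognition sequences of the enzymes named in the table. The recognition sequences are the standard ones for these enzymes; the name `find_sites` and the dictionary layout are illustrative assumptions and are not part of the original protocol.

```python
# Standard recognition sequences for the enzymes named in the primer table.
SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "NcoI": "CCATGG",
}


def find_sites(primer):
    """Return (enzyme, 0-based position) hits found in a primer, ordered by position."""
    # Hyphens and spaces only separate the tail from the gene-specific part.
    seq = primer.replace("-", "").replace(" ", "").upper()
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Forward primer with a BamHI-NdeI tail (the ATG is part of the NdeI site).
    print(find_sites("CGCGGATCCCATATG-AGCAGCGGAGGGGGTG"))
    # Forward primer with a BamHI-NheI tail (adds the codons GCT AGC, i.e. Ala-Ser).
    print(find_sites("CGCGGATCCGCTAGC-AAACTGTCGTTGGTGTTAAC"))
```

A check of this kind is also a convenient way to spot transcription errors in a tail, since every tail in the table should contain the site listed next to it.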

Preparation of chromosomal DNA templates

N.meningitidis strains 2996, MC58, 394.98, 1000 and BZ232 (and others) were grown to
exponential phase in 100ml of GC medium, harvested by centrifugation, and resuspended in
5ml buffer (20% w/v sucrose, 50mM Tris-HCl, 50mM EDTA, pH8). After 10 minutes
incubation on ice, the bacteria were lysed by adding 10ml of lysis solution (50mM NaCl, 1%
Na-Sarkosyl, 50µg/ml Proteinase K), and the suspension incubated at 37°C for 2 hours. Two
phenol extractions (equilibrated to pH 8) and one CHCl3/isoamyl alcohol (24:1) extraction
were performed. DNA was precipitated by addition of 0.3M sodium acetate and 2 volumes
of ethanol, and collected by centrifugation. The pellet was washed once with 70% (v/v)
ethanol and redissolved in 4.0ml TE buffer (10mM Tris-HCl, 1mM EDTA, pH 8.0). The
DNA concentration was measured by reading OD260.
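
The text does not state how the OD260 reading was converted to a concentration. A minimal sketch, assuming the usual convention that one OD260 unit corresponds to about 50 µg/ml of double-stranded DNA, is given below; the function names and the example dilution factor are hypothetical.

```python
def dsdna_conc_ug_per_ml(od260, dilution_factor=1.0, ug_per_ml_per_od=50.0):
    """Estimate dsDNA concentration from an OD260 reading.

    Assumes the common conversion of 50 ug/ml per OD260 unit for dsDNA;
    dilution_factor is the fold-dilution of the sample that was read.
    """
    return od260 * dilution_factor * ug_per_ml_per_od


def volume_for_mass_ul(mass_ng, conc_ug_per_ml):
    """Volume (ul) containing mass_ng of DNA; note that 1 ug/ml equals 1 ng/ul."""
    return mass_ng / conc_ug_per_ml


if __name__ == "__main__":
    conc = dsdna_conc_ug_per_ml(0.2, dilution_factor=20)
    # Volume needed to supply 200 ng of genomic template to a PCR reaction.
    print(conc, volume_for_mass_ul(200, conc))
```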
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PCR Amplification 

The standard PCR protocol was as follows: 200ng of genomic DNA from 2996, MC58, 1000,
or BZ232 strains or 10ng of plasmid DNA preparation of recombinant clones were used as
template in the presence of 40µM of each oligonucleotide primer, 400-800µM dNTPs
solution, 1x PCR buffer (including 1.5mM MgCl2), 2.5 units TaqI DNA polymerase (using
Perkin-Elmer AmpliTaq, Boehringer Mannheim Expand™ Long Template).

After a preliminary 3 minute incubation of the whole mix at 95°C, each sample underwent a
two-step amplification: the first 5 cycles were performed using the hybridisation temperature
that excluded the restriction enzyme tail of the primer (Tm1). This was followed by 30 cycles
according to the hybridisation temperature calculated for the whole length oligos (Tm2).
Elongation times, performed at 68°C or 72°C, varied according to the length of the Orf to be
amplified. In the case of Orf1 the elongation time, starting from 3 minutes, was increased by
15 seconds each cycle. The cycles were completed with a 10 minute extension step at 72°C.
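
The two-step program can be written out as a per-cycle schedule. The sketch below is only an illustration: the Tm values in the example are hypothetical placeholders, and it assumes that the 15 second increment used for Orf1 applies across all 35 cycles, which the text does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    number: int           # 1-based cycle number
    anneal_temp_c: float  # hybridisation temperature for this cycle
    elongation_s: int     # elongation time in seconds


def two_step_schedule(tm1_c, tm2_c, base_elongation_s, increment_s=0,
                      first_cycles=5, second_cycles=30):
    """Build the cycle list: first_cycles at Tm1, then second_cycles at Tm2."""
    schedule = []
    for i in range(first_cycles + second_cycles):
        temp = tm1_c if i < first_cycles else tm2_c
        schedule.append(Cycle(i + 1, temp, base_elongation_s + i * increment_s))
    return schedule


if __name__ == "__main__":
    # Orf1-style program: 3 minutes elongation, extended by 15 s per cycle.
    # The Tm values (52 and 68) are made-up examples, not values from the text.
    for cycle in two_step_schedule(52.0, 68.0, 180, increment_s=15):
        print(cycle)
```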

The amplified DNA was loaded directly on a 1% agarose gel. The DNA fragment
corresponding to the band of correct size was purified from the gel using the Qiagen Gel
Extraction Kit, following the manufacturer's protocol.

Digestion of PCR fragments and of the cloning vectors 

The purified DNA corresponding to the amplified fragment was digested with the
appropriate restriction enzymes for cloning into pET-21b+, pET-22b+ or pET-24b+. Digested
fragments were purified using the QIAquick PCR purification kit (following the
manufacturer's instructions) and eluted with either H2O or 10mM Tris, pH 8.5. Plasmid
vectors were digested with the appropriate restriction enzymes, loaded onto a 1.0% agarose
gel and the band corresponding to the digested vector purified using the Qiagen QIAquick
Gel Extraction Kit.

Cloning

The fragments corresponding to each gene, previously digested and purified, were ligated
into pET-21b+, pET-22b+ or pET-24b+. A molar ratio of 3:1 fragment/vector was used with
T4 DNA ligase in the ligation buffer supplied by the manufacturer.
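
A 3:1 molar ratio translates into a mass of insert that depends on the relative lengths of fragment and vector. The following sketch shows the usual calculation; the vector and insert lengths in the example are hypothetical and are not taken from the text.

```python
def insert_mass_ng(vector_mass_ng, vector_len_bp, insert_len_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert mass = vector mass x (insert length / vector length) x ratio.
    """
    return vector_mass_ng * (insert_len_bp / vector_len_bp) * molar_ratio


if __name__ == "__main__":
    # Example with made-up sizes: 100 ng of a 5000 bp vector, 1000 bp insert.
    print(insert_mass_ng(100, 5000, 1000))
```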

Recombinant plasmid was transformed into competent E.coli DH5 or HB101 by incubating
the ligase reaction solution and bacteria for 40 minutes on ice, then at 37°C for 3 minutes.



WO 01/64922 



PCT/IB01/00452



-94- 

This was followed by the addition of 800µl LB broth and incubation at 37°C for 20 minutes.
The cells were centrifuged at maximum speed in an Eppendorf microfuge, resuspended in
approximately 200µl of the supernatant and plated onto LB ampicillin (100mg/ml) agar.

Screening for recombinant clones was performed by growing randomly selected colonies
overnight at 37°C in 4.0ml of LB broth + 100µg/ml ampicillin. Cells were pelleted and
plasmid DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the
manufacturer's instructions. Approximately 1µg of each individual miniprep was digested
with the appropriate restriction enzymes and the digest loaded onto a 1-1.5% agarose gel
(depending on the expected insert size), in parallel with the molecular weight marker (1kb
DNA Ladder, GIBCO). Positive clones were selected on the basis of the size of insert.

Expression 

After cloning each gene into the expression vector, recombinant plasmids were transformed
into E.coli strains suitable for expression of the recombinant protein. 1µl of each construct
was used to transform E.coli BL21-DE3 as described above. Single recombinant colonies
were inoculated into 2ml LB+Amp (100µg/ml), incubated at 37°C overnight, then diluted
1:30 in 20ml of LB+Amp (100µg/ml) in 100ml flasks, to give an OD600 between 0.1 and 0.2.
The flasks were incubated at 30°C or at 37°C in a gyratory water bath shaker until OD600
indicated exponential growth suitable for induction of expression (0.4-0.8 OD). Protein
expression was induced by addition of 1.0mM IPTG. After 3 hours incubation at 30°C or
37°C the OD600 was measured and expression examined. 1.0ml of each sample was
centrifuged in a microfuge, the pellet resuspended in PBS and analysed by SDS-PAGE and
Coomassie Blue staining.
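
The volumes implied by this small-scale expression protocol can be worked out as in the sketch below. It assumes that "1:30" means one part overnight culture in thirty parts final volume, and that IPTG is added from a 1 M stock; both are assumptions, since the text gives neither detail.

```python
def inoculum_volume_ml(final_volume_ml, dilution=30):
    """Overnight culture needed for a 1:dilution inoculum (1 part in `dilution` total)."""
    return final_volume_ml / dilution


def iptg_stock_volume_ul(culture_ml, final_mm=1.0, stock_mm=1000.0):
    """Volume (ul) of IPTG stock giving final_mm in culture_ml of culture.

    Ignores the small volume change caused by adding the stock itself.
    """
    return culture_ml * 1000.0 * final_mm / stock_mm


def ready_to_induce(od600, low=0.4, high=0.8):
    """True when OD600 lies in the 0.4-0.8 window used for induction."""
    return low <= od600 <= high


if __name__ == "__main__":
    print(inoculum_volume_ml(20))    # ml of overnight culture for a 20 ml flask
    print(iptg_stock_volume_ul(20))  # ul of the assumed 1 M IPTG stock
    print(ready_to_induce(0.6))
```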

Gateway cloning and expression 

Sequences labelled GATE were cloned and expressed using the GATEWAY Cloning
Technology (GIBCO-BRL). Recombinational cloning (RC) is based on the recombination
reactions that mediate the integration and excision of phage into and from the E.coli genome,
respectively. The integration involves recombination of the attP site of the phage DNA within
the attB site located in the bacterial genome (BP reaction) and generates an integrated phage
genome flanked by attL and attR sites. The excision recombines attL and attR sites back to attP
and attB sites (LR reaction). The integration reaction requires two enzymes [the phage protein
Integrase (Int) and the bacterial protein integration host factor (IHF)] (BP clonase). The
excision reaction requires Int, IHF, and an additional phage enzyme, Excisionase (Xis) (LR
clonase). Artificial derivatives of the 25-bp bacterial attB recombination site, referred to as B1
and B2, were added to the 5' end of the primers used in PCR reactions to amplify Neisserial
ORFs. The resulting products were BP cloned into a "Donor vector" containing complementary
derivatives of the phage attP recombination site (P1 and P2) using BP clonase. The resulting
"Entry clones" contain ORFs flanked by derivatives of the attL site (L1 and L2) and were
subcloned into expression "destination vectors" which contain derivatives of the
attL-compatible attR sites (R1 and R2) using LR clonase. This resulted in "expression clones" in
which ORFs are flanked by B1 and B2 and fused in frame to the GST or His N terminal tags.

The E.coli strain used for GATEWAY expression is BL21-SI. Cells of this strain are induced
for expression of the T7 RNA polymerase by growth in medium containing salt (0.3 M NaCl).

Note that this system gives N-terminus His tags. 
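
The GATE primers in the table above all carry the same lower-case attB1 or attB2 tail in front of the gene-specific sequence. The sketch below simply reproduces that construction; the tail strings are copied from the table entries, while the function name `gateway_primers` is an illustrative assumption.

```python
# Tails as they appear (in lower case) on the GATE primers in the table.
ATTB1_TAIL = "ggggacaagtttgtacaaaaaagcaggct"
ATTB2_TAIL = "ggggaccactttgtacaagaaagctgggt"


def gateway_primers(fwd_specific, rev_specific):
    """Prefix gene-specific sequences with the attB1 (Fwd) and attB2 (Rev) tails."""
    return (ATTB1_TAIL + fwd_specific.upper(),
            ATTB2_TAIL + rev_specific.upper())


if __name__ == "__main__":
    # Gene-specific parts taken from the 925-His/GST (MC58) GATE entry.
    fwd, rev = gateway_primers("TGCGGCAAGGATGCCGG", "CTAAAGCAACAATGCCGG")
    print(fwd)
    print(rev)
```
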
Preparation of membrane proteins. 

Fractions composed principally of either inner, outer or total membrane were isolated in
order to obtain recombinant proteins expressed with membrane-localisation leader
sequences. The method for preparation of membrane fractions, enriched for recombinant
proteins, was adapted from Filip et al. [J.Bact. (1973) 115:717-722] and Davies et al.
[J.Immunol.Meth. (1990) 143:215-225]. Single colonies harbouring the plasmid of interest
were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were
diluted 1:30 in 1.0 L of fresh medium and grown at either 30°C or 37°C until the OD550
reached 0.6-0.8. Expression of recombinant protein was induced with IPTG at a final
concentration of 1.0 mM. After incubation for 3 hours, bacteria were harvested by
centrifugation at 8000g for 15 minutes at 4°C and resuspended in 20 ml of 20 mM Tris-HCl
(pH 7.5) and complete protease inhibitors (Boehringer-Mannheim). All subsequent
procedures were performed at 4°C or on ice.

Cells were disrupted by sonication using a Branson Sonifier 450 and centrifuged at 5000g
for 20 min to sediment unbroken cells and inclusion bodies. The supernatant, containing
membranes and cellular debris, was centrifuged at 50000g (Beckman Ti50, 29000rpm) for
75 min, washed with 20 mM Bis-tris propane (pH 6.5), 1.0 M NaCl, 10% (v/v) glycerol and
sedimented again at 50000g for 75 minutes. The pellet was resuspended in 20mM Tris-HCl
(pH 7.5), 2.0% (v/v) Sarkosyl, complete protease inhibitor (1.0 mM EDTA, final
concentration) and incubated for 20 minutes to dissolve inner membrane. Cellular debris was
pelleted by centrifugation at 5000g for 10 min and the supernatant centrifuged at 75000g for
75 minutes (Beckman Ti50, 33000rpm). Proteins 008L and 519L were found in the
supernatant suggesting inner membrane localisation. For these proteins both inner and total
membrane fractions (washed with NaCl as above) were used to immunise mice. Outer
membrane vesicles obtained from the 75000g pellet were washed with 20 mM Tris-HCl (pH
7.5) and centrifuged at 75000g for 75 minutes or overnight. The OMV was finally
resuspended in 500 µl of 20 mM Tris-HCl (pH 7.5), 10% v/v glycerol. Orf1L and Orf40L
were both localised and enriched in the outer membrane fraction which was used to
immunise mice. Protein concentration was estimated by standard Bradford Assay (Bio-Rad),
while protein concentration of inner membrane fraction was determined with the DC protein
assay (Bio-Rad). Various fractions from the isolation procedure were assayed by SDS-PAGE.
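
The protocol quotes both g-forces and rotor speeds. For a different rotor, the two can be interconverted with the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The sketch below applies it; the radius used in the example is a hypothetical value, not a specification of the Ti50 rotor.

```python
import math

# Standard constant relating rpm and radius (cm) to relative centrifugal force.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 6.0  # hypothetical radius; check the manual of the rotor in use
    print(round(rpm_from_rcf(50000, radius_cm)))
    print(round(rpm_from_rcf(75000, radius_cm)))
```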

Purification of His-tagged proteins

Various forms of 287 were cloned from strains 2996 and MC58. They were constructed with
a C-terminus His-tagged fusion and included a mature form (aa 18-427), constructs with
deletions (Δ1, Δ2, Δ3 and Δ4) and clones composed of either B or C domains. For each
clone purified as a His-fusion, a single colony was streaked and grown overnight at 37°C on
a LB/Amp (100 µg/ml) agar plate. An isolated colony from this plate was inoculated into
20ml of LB/Amp (100 µg/ml) liquid medium and grown overnight at 37°C with shaking.
The overnight culture was diluted 1:30 into 1.0 L LB/Amp (100 µg/ml) liquid medium and
allowed to grow at the optimal temperature (30 or 37°C) until the OD550 reached 0.6-0.8.
Expression of recombinant protein was induced by addition of IPTG (final concentration
1.0mM) and the culture incubated for a further 3 hours. Bacteria were harvested by
centrifugation at 8000g for 15 min at 4°C. The bacterial pellet was resuspended in 7.5 ml of
either (i) cold buffer A (300 mM NaCl, 50 mM phosphate buffer, 10 mM imidazole, pH 8.0)
for soluble proteins or (ii) buffer B (10mM Tris-HCl, 100 mM phosphate buffer, pH 8.8 and,
optionally, 8M urea) for insoluble proteins. Proteins purified in a soluble form included
287-His, Δ1, Δ2, Δ3 and Δ4287-His, Δ4287MC58-His, 287c-His and 287cMC58-His.
Protein 287bMC58-His was insoluble and purified accordingly. Cells were disrupted by
sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged
at 13000xg for 30 min at 4°C. For insoluble proteins, pellets were resuspended in 2.0 ml
buffer C (6 M guanidine hydrochloride, 100 mM phosphate buffer, 10 mM Tris-HCl, pH 7.5)
and treated with 10 passes of a Dounce homogenizer. The homogenate was centrifuged at
13000g for 30 min and the supernatant retained. Supernatants for both soluble and insoluble
preparations were mixed with 150µl Ni2+-resin (previously equilibrated with either buffer A
or buffer B, as appropriate) and incubated at room temperature with gentle agitation for 30
min. The resin was Chelating Sepharose Fast Flow (Pharmacia), prepared according to the
manufacturer's protocol. The batch-wise preparation was centrifuged at 700g for 5 min at
4°C and the supernatant discarded. The resin was washed twice (batch-wise) with 10ml
buffer A or B for 10 min, resuspended in 1.0 ml buffer A or B and loaded onto a disposable
column. The resin continued to be washed with either (i) buffer A at 4°C or (ii) buffer B at
room temperature, until the OD280 of the flow-through reached 0.02-0.01. The resin was
further washed with either (i) cold buffer C (300mM NaCl, 50mM phosphate buffer, 20mM
imidazole, pH 8.0) or (ii) buffer D (10mM Tris-HCl, 100mM phosphate buffer, pH 6.3 and,
optionally, 8M urea) until OD280 of the flow-through reached 0.02-0.01. The His-fusion
protein was eluted by addition of 700µl of either (i) cold elution buffer A (300 mM NaCl,
50mM phosphate buffer, 250 mM imidazole, pH 8.0) or (ii) elution buffer B (10 mM
Tris-HCl, 100 mM phosphate buffer, pH 4.5 and, optionally, 8M urea) and fractions
collected until the OD280 indicated all the recombinant protein was obtained. 20µl aliquots of
each elution fraction were analysed by SDS-PAGE. Protein concentrations were estimated
using the Bradford assay.

Renaturation of denatured His-fusion proteins.

Denaturation was required to solubilize 287bMC58, so a renaturation step was employed prior
to immunisation. Glycerol was added to the denatured fractions obtained above to give a
final concentration of 10% v/v. The proteins were diluted to 200 µg/ml using dialysis buffer
I (10% v/v glycerol, 0.5M arginine, 50 mM phosphate buffer, 5.0 mM reduced glutathione,
0.5 mM oxidised glutathione, 2.0M urea, pH 8.8) and dialysed against the same buffer for
12-14 hours at 4°C. Further dialysis was performed with buffer II (10% v/v glycerol, 0.5M
arginine, 50mM phosphate buffer, 5.0mM reduced glutathione, 0.5mM oxidised glutathione,
pH 8.8) for 12-14 hours at 4°C. Protein concentration was estimated using the formula:
Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)
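
The absorbance formula above can be applied as in the following sketch. The A280 coefficient is set to 1.55, the value usually quoted for this A280/A260 correction; that is an assumption about the intended figure, and both coefficients are exposed as parameters so that other values can be used.

```python
def protein_mg_per_ml(od280, od260, k280=1.55, k260=0.76):
    """Estimate protein concentration (mg/ml) from OD280 and OD260 readings.

    k280 = 1.55 is an assumed default (the commonly quoted coefficient);
    k260 = 0.76 corrects for nucleic acid absorbance.
    """
    return k280 * od280 - k260 * od260


if __name__ == "__main__":
    print(protein_mg_per_ml(1.0, 0.5))
```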



Amino acid sequence analysis.

Automated sequence analysis of the NH2-terminus of proteins was performed on a Beckman 
sequencer (LF 3000) equipped with an on-line phenylthiohydantoin-amino acid analyser 
(System Gold) according to the manufacturer's recommendations. 

Immunization

Balb/C mice were immunized with antigens on days 0, 21 and 35 and sera analyzed at day 49.

Sera analysis - ELISA

The acapsulated MenB M7 and the capsulated strains were plated on chocolate agar plates
and incubated overnight at 37°C with 5% CO2. Bacterial colonies were collected from the
agar plates using a sterile Dacron swab and inoculated into Mueller-Hinton Broth (Difco)
containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following
OD620. The bacteria were left to grow until the OD reached the value of 0.4-0.5. The culture
was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and bacteria
were washed twice with PBS, resuspended in PBS containing 0.025% formaldehyde, and
incubated for 1 hour at 37°C and then overnight at 4°C with stirring. 100µl bacterial cells
were added to each well of a 96 well Greiner plate and incubated overnight at 4°C. The wells
were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200µl of
saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the
plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200µl of
diluted sera (Dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to
each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with
PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in dilution
buffer were added to each well and the plates were incubated for 90 minutes at 37°C. Wells
were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25ml of citrate
buffer pH5, 10mg of O-phenylenediamine and 10µl of H2O2) were added to each well and the
plates were left at room temperature for 20 minutes. 100µl 12.5% H2SO4 was added to each
well and OD490 was followed. The ELISA titers were calculated arbitrarily as the dilution of
sera which gave an OD490 value of 0.4 above the level of preimmune sera. The ELISA was
considered positive when the dilution of sera with OD490 of 0.4 was higher than 1:400.
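
The titer definition above (the dilution giving an OD490 of 0.4 above preimmune serum, positive when higher than 1:400) can be sketched as follows. The text does not say how the dilution was read off between measured points; this example assumes linear interpolation of OD490 against the logarithm of the dilution, and all names are illustrative.

```python
import math


def elisa_titer(dilutions, od490, preimmune_od, cutoff=0.4):
    """Reciprocal dilution at which OD490 equals preimmune_od + cutoff.

    dilutions: reciprocal dilutions in increasing order (e.g. 200, 400, 800).
    od490: matching OD490 readings. Returns None if the target is not bracketed.
    Interpolation on log10(dilution) is an assumption, not the stated method.
    """
    target = preimmune_od + cutoff
    points = list(zip(dilutions, od490))
    for (d1, y1), (d2, y2) in zip(points, points[1:]):
        if y1 >= target >= y2 and y1 != y2:
            frac = (y1 - target) / (y1 - y2)
            log_d = math.log10(d1) + frac * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    return None


def is_positive(titer, threshold=400):
    """Positive when the titer is higher than 1:400."""
    return titer is not None and titer > threshold


if __name__ == "__main__":
    titer = elisa_titer([200, 400, 800], [1.2, 0.8, 0.4], preimmune_od=0.2)
    print(titer, is_positive(titer))
```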

Sera analysis - FACS Scan bacteria binding assay

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated
overnight at 37°C with 5% CO2. Bacterial colonies were collected from the agar plates using
a sterile Dacron swab and inoculated into 4 tubes containing 8ml each Mueller-Hinton Broth
(Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by
following OD620. The bacteria were left to grow until the OD reached the value of 0.35-0.5.
The culture was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and
the pellet was resuspended in blocking buffer (1% BSA in PBS, 0.4% NaN3) and centrifuged
for 5 minutes at 4000rpm. Cells were resuspended in blocking buffer to reach OD620 of 0.05.
100µl bacterial cells were added to each well of a Costar 96 well plate. 100µl of diluted
(1:100, 1:200, 1:400) sera (in blocking buffer) were added to each well and plates incubated
for 2 hours at 4°C. Cells were centrifuged for 5 minutes at 4000rpm, the supernatant
aspirated and cells washed by addition of 200µl/well of blocking buffer in each well. 100µl
of R-Phycoerythrin conjugated F(ab)2 goat anti-mouse, diluted 1:100, was added to each well
and plates incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000rpm
for 5 minutes and washed by addition of 200µl/well of blocking buffer. The supernatant was
aspirated and cells resuspended in 200µl/well of PBS, 0.25% formaldehyde. Samples were
transferred to FACScan tubes and read. The FACScan settings (Laser Power 15mW)
were: FL2 on; FSC-H threshold: 92; FSC PMT Voltage: E 01; SSC PMT: 474; Amp.
Gains 6.1; FL-2 PMT: 586; compensation values: 0.

Sera analysis - bactericidal assay 

N. meningitidis strain 2996 was grown overnight at 37°C on chocolate agar plates (starting
from a frozen stock) with 5% CO2. Colonies were collected and used to inoculate 7ml
Mueller-Hinton broth, containing 0.25% glucose, to reach an OD620 of 0.05-0.08. The culture
was incubated for approximately 1.5 hours at 37°C with shaking until the OD620
reached a value of 0.23-0.24. Bacteria were diluted in 50mM phosphate buffer pH 7.2
containing 10mM MgCl2, 10mM CaCl2 and 0.5% (w/v) BSA (assay buffer) to the working
dilution of 10^5 CFU/ml. The total volume of the final reaction mixture was 50µl, with 25µl
of serial two-fold dilutions of test serum, 12.5µl of bacteria at the working dilution and 12.5µl of
baby rabbit complement (final concentration 25%).
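
As a check on the figures given above, the composition of the reaction mixture can be worked through in a few lines. This is a sketch only; the constant and function names are hypothetical, and the CFU per well is simply the working dilution multiplied by the volume of bacteria added.

```python
# Volumes and concentration as stated in the protocol.
TOTAL_UL = 50.0
SERUM_UL = 25.0
BACTERIA_UL = 12.5
COMPLEMENT_UL = 12.5
WORKING_DILUTION_CFU_PER_ML = 1e5


def reaction_summary():
    """Derive the quantities implied by the stated volumes."""
    if SERUM_UL + BACTERIA_UL + COMPLEMENT_UL != TOTAL_UL:
        raise ValueError("component volumes do not add up to the total volume")
    return {
        # Complement as a percentage of the final reaction volume.
        "complement_percent": 100.0 * COMPLEMENT_UL / TOTAL_UL,
        # Bacteria delivered to each well (1 ml = 1000 µl).
        "cfu_per_well": WORKING_DILUTION_CFU_PER_ML * BACTERIA_UL / 1000.0,
        # Additional dilution of each serum dilution once in the reaction.
        "serum_fold_dilution_in_reaction": TOTAL_UL / SERUM_UL,
    }


summary = reaction_summary()
```

The computed complement fraction agrees with the stated final concentration of 25%, and each well receives about 1250 CFU.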

Controls included bacteria incubated with complement serum, and immune sera incubated with
bacteria and with complement inactivated by heating at 56°C for 30 minutes. Immediately after the
addition of the baby rabbit complement, 10µl of the controls were plated on Mueller-Hinton
agar plates using the tilt method (time 0). The 96-well plate was incubated for 1 hour at
37°C with rotation. 7µl of each sample were plated on Mueller-Hinton agar plates as spots,
whereas 10µl of the controls were plated on Mueller-Hinton agar plates using the tilt method
(time 1). Agar plates were incubated for 18 hours at 37°C and the colonies
corresponding to time 0 and time 1 were counted.
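
Because samples (7µl spots) and controls (10µl, tilt method) are plated in different volumes, colony counts are most easily compared after conversion to CFU/ml. The sketch below shows one way this could be done. The protocol does not state how the counts were compared, so the survival percentage, the function names and the example counts are assumptions made for illustration.

```python
def cfu_per_ml(colonies, plated_ul):
    """Convert a colony count to CFU/ml given the plated volume in µl."""
    if plated_ul <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * 1000.0 / plated_ul


def percent_survival(colonies_t1, plated_ul_t1, colonies_t0, plated_ul_t0):
    """Survival at time 1 relative to time 0, after normalising plated volumes.

    Assumption: this ratio is one common way to compare the two counts;
    it is not specified in the protocol.
    """
    t0 = cfu_per_ml(colonies_t0, plated_ul_t0)
    if t0 == 0:
        raise ValueError("no colonies at time 0")
    return 100.0 * cfu_per_ml(colonies_t1, plated_ul_t1) / t0


# Hypothetical counts: 35 colonies from a 7 µl sample spot at time 1,
# 250 colonies from a 10 µl control at time 0.
survival = percent_survival(35, 7.0, 250, 10.0)
```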

Sera analysis - western blots 

Purified proteins (500ng/lane), outer membrane vesicles (5µg) and total cell extracts (25µg)
derived from MenB strain 2996 were loaded onto a 12% SDS-polyacrylamide gel and
transferred to a nitrocellulose membrane. The transfer was performed for 2 hours at 150mA
at 4°C, using transfer buffer (0.3% Tris base, 1.44% glycine, 20% (v/v) methanol). The
membrane was saturated by overnight incubation at 4°C in saturation buffer (10% skimmed
milk, 0.1% Triton X100 in PBS). The membrane was washed twice with washing buffer (3%
skimmed milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mouse sera
diluted 1:200 in washing buffer. The membrane was washed twice and incubated for 90
minutes with a 1:2000 dilution of horseradish peroxidase labelled anti-mouse Ig. The
membrane was washed twice with 0.1% Triton X100 in PBS and developed with the Opti-4CN
Substrate Kit (Bio-Rad). The reaction was stopped by adding water.

The OMVs were prepared as follows: N. meningitidis strain 2996 was grown overnight at 37°C
with 5% CO2 on 5 GC plates, harvested with a loop and resuspended in 10 ml of
20mM Tris-HCl pH 7.5, 2 mM EDTA. Heat inactivation was performed at 56°C for 45
minutes and the bacteria were disrupted by sonication for 5 minutes on ice (50% duty cycle, 50%
output, Branson sonifier 3 mm microtip). Unbroken cells were removed by centrifugation at
5000g for 10 minutes, and the supernatant containing the total cell envelope fraction was recovered
and further centrifuged overnight at 50000g at 4°C. The pellet containing
the membranes was resuspended in 2% sarkosyl, 20mM Tris-HCl pH 7.5, 2 mM EDTA and
incubated at room temperature for 20 minutes to solubilise the inner membranes. The
suspension was centrifuged at 10000g for 10 minutes to remove aggregates, and the supernatant
was further centrifuged at 50000g for 3 hours. The pellet, containing the outer membranes,
was washed in PBS and resuspended in the same buffer. Protein concentration was measured
by the DC Bio-Rad Protein Assay (modified Lowry method), using BSA as a standard.

Total cell extracts were prepared as follows: N. meningitidis strain 2996 was grown
overnight on a GC plate, harvested with a loop and resuspended in 1ml of 20mM Tris-HCl.
Heat inactivation was performed at 56°C for 30 minutes.

961 domain studies 

Cellular fractions preparation

Total lysate, periplasm, supernatant and OMV of E.coli clones
expressing different domains of 961 were prepared using bacteria from over-night cultures or
after 3 hours induction with IPTG. Briefly, the periplasm was obtained by suspending bacteria
in 25% sucrose and 50mM Tris (pH 8) with polymyxin 100µg/ml. After 1hr at room
temperature, bacteria were centrifuged at 13000rpm for 15 min and the supernatants were
collected. The culture supernatants were filtered at 0.2µm and precipitated with 50% TCA
on ice for two hours. After centrifugation (30 min at 13000 rpm) pellets were rinsed twice with
70% ethanol and suspended in PBS. The OMV preparation was performed as previously
described. Each cellular fraction was analyzed by SDS-PAGE or by Western Blot using the
polyclonal anti-serum raised against GST-961.

Adhesion assay

Chang epithelial cells (Wong-Kilbourne derivative, clone 1-5c-4, human
conjunctiva) were maintained in DMEM (Gibco) supplemented with 10% heat-inactivated
FCS, 15mM L-glutamine and antibiotics.

For the adherence assay, sub-confluent cultures of Chang epithelial cells were rinsed with
PBS and treated with trypsin-EDTA (Gibco), to release them from the plastic support. The
cells were then suspended in PBS, counted and diluted in PBS to 5x10^5 cells/ml.

Bacteria from over-night cultures or after induction with IPTG were pelleted and washed
twice with PBS by centrifuging at 13000 for 5 min. Approximately 2-3x10^8 cfu were
incubated with 0.5 mg/ml FITC (Sigma) in 1ml buffer containing 50mM NaHCO3 and
100mM NaCl pH 8, for 30 min at room temperature in the dark. FITC-labeled bacteria were
washed 2-3 times and suspended in PBS at 1-1.5x10^9/ml. 200µl of this suspension (2-3x10^8)
were incubated with 200µl (1x10^5) epithelial cells for 30 min at 37°C. Cells were then
centrifuged at 2000rpm for 5 min to remove non-adherent bacteria, suspended in 200µl of
PBS, transferred to FACScan tubes and read.
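
The numbers of bacteria and epithelial cells brought together in the adhesion assay follow directly from the stated concentrations and volumes. The short sketch below reproduces that arithmetic and derives the resulting bacteria-to-cell ratio. The ratio is not stated in the protocol and is computed here purely for illustration, and the function name is hypothetical.

```python
def count_in_volume(conc_per_ml, volume_ul):
    """Number of particles in `volume_ul` microlitres at `conc_per_ml` per ml."""
    return conc_per_ml * volume_ul / 1000.0


# 200 µl of FITC-labelled bacteria at 1-1.5x10^9/ml.
bacteria_low = count_in_volume(1.0e9, 200.0)
bacteria_high = count_in_volume(1.5e9, 200.0)

# 200 µl of epithelial cells at 5x10^5 cells/ml.
cells = count_in_volume(5.0e5, 200.0)

# Derived bacteria-to-cell ratio (not given in the protocol).
ratio_low = bacteria_low / cells
ratio_high = bacteria_high / cells
```

These values agree with the 2-3x10^8 bacteria and 1x10^5 cells quoted above, corresponding to roughly 2000 to 3000 bacteria per epithelial cell.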

CLAIMS

1. A method for the heterologous expression of a protein of the invention, in which (a) at 
least one domain in the protein is deleted and, optionally, (b) no fusion partner is used. 

2. The method of claim 1, in which the protein of the invention is ORF46. 

3. The method of claim 2, in which ORF46 is divided into a first domain (amino acids
1-433) and a second domain (amino acids 433-608).

4. The method of claim 2, in which the protein of the invention is 564. 

5. The method of claim 4, in which protein 564 is divided into domains as shown in Figure 
8. 

6. The method of claim 1, in which the protein of the invention is 961.

7. The method of claim 6, in which protein 961 is divided into domains as shown in Figure 
12. 

8. The method of claim 1, in which the protein of the invention is 502 and the domain is
amino acids 28 to 167 (numbered according to the MC58 sequence).

9. The method of claim 1, in which the protein of the invention is 287.

10. A method for the heterologous expression of a protein of the invention, in which (a) a 
portion of the N-terminal domain of the protein is deleted. 

11. The method of claim 9 or claim 10, in which protein 287 is divided into domains A, B &
C shown in Figure 5.

12. The method of claim 11, in which (i) domain A, (ii) domains A and B, or (iii) domains A
and C are deleted.

13. The method of claim 11, wherein (i) amino acids 1-17, (ii) amino acids 1-25, (iii) amino 
acids 1-69, or (iv) amino acids 1-106, of domain A are deleted. 

14. A method for the heterologous expression of a protein of the invention, in which (a) no
fusion partner is used, and (b) the protein's native leader peptide (if present) is used.

15. The method of claim 14, in which the protein of the invention is selected from the group
consisting of: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1,
503, 519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 936-1,
953, 961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1,
Orf40.2, Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109,
NMB2050, 008, 105, 117-1, 121-1, 122-1, 128-1, 148, 216, 243, 308, 593, 652, 726,
926, 982, Orf83-1 and Orf143-1.

16. A method for the heterologous expression of a protein of the invention, in which (a) the
protein's leader peptide is replaced by the leader peptide from a different protein and,
optionally, (b) no fusion partner is used.

17. The method of claim 16, in which the different protein is 961, ORF4, E.coli OmpA, or
E.carotovora PelB, or in which the leader peptide is MKKYLFSAA.

18. The method of claim 17, in which the different protein is E.coli OmpA and the protein of
the invention is ORF1.

19. The method of claim 17, in which the protein of the invention is 911 and the different
protein is E.carotovora PelB or E.coli OmpA.

20. The method of claim 17, in which the different protein is ORF4 and the protein of the 
invention is 287. 

21. A method for the heterologous expression of a protein of the invention, in which (a) the
protein's leader peptide is deleted and, optionally, (b) no fusion partner is used.

22. The method of claim 21, in which the protein of the invention is 919. 

23. A method for the heterologous expression of a protein of the invention, in which 
expression of a protein of the invention is carried out at a temperature at which a toxic 
activity of the protein is not manifested. 

24. The method of claim 23, in which protein 919 is expressed at 30°C.

25. A method for the heterologous expression of a protein of the invention, in which the protein
is mutated to reduce or eliminate toxic activity.

26. The method of claim 25, in which the protein of the invention is 907, 919 or 922.

27. The method of claim 26, in which 907 is mutated at Glu-117 (e.g. Glu→Gly).

28. The method of claim 26, in which 919 is mutated at Glu-255 (e.g. Glu→Gly) and/or
Glu-323 (e.g. Glu→Gly).

29. The method of claim 26, in which 922 is mutated at Glu-164 (e.g. Glu→Gly), Ser-213
(e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly).

30. A method for the heterologous expression of a protein of the invention, in which vector 
pSM214 is used or vector pET-24b is used. 

31. The method of claim 30, in which the protein of the invention is 953 and the vector is
pSM214.

32. A method for the heterologous expression of a protein of the invention, in which a 
protein is expressed or purified such that it adopts a particular multimeric form. 

33. The method of claim 32, in which protein 953 is expressed and/or purified in monomeric 
form. 

34. The method of claim 32, in which protein 961 is expressed and/or purified in tetrameric
form.

35. The method of claim 32, in which protein 287 is expressed and/or purified in dimeric 
form. 

36. The method of claim 32, in which protein 919 is expressed and/or purified in monomeric
form.

37. A method for the heterologous expression of a protein of the invention, in which the 
protein is expressed as a lipidated protein. 

38. The method of claim 37, in which the protein of the invention is 919, 287, ORF4, 406, 
576, or ORF25. 

39. A method for the heterologous expression of a protein of the invention, in which (a) the
protein's C-terminus region is mutated and, optionally, (b) no fusion partner is used.

40. The method of claim 39, wherein the mutation is a substitution, an insertion, or a deletion.

41. The method of claim 40, wherein the protein of the invention is 730, ORF29 or ORF46. 

42. A method for the heterologous expression of a protein of the invention, in which the 
protein's leader peptide is mutated. 

43. The method of claim 42, in which the protein of the invention is 919.

44. A method for the heterologous expression of a protein, in which a poly-glycine stretch 
within the protein is mutated. 

45. The method of claim 44, wherein the protein is a protein of the invention. 

46. The method of claim 45, wherein the protein of the invention is 287, 741, 983 or Tbp2.

47. The method of claim 46, wherein (Gly)6 is deleted from 287 or 983.

48. The method of claim 46, wherein (Gly)4 is deleted from Tbp2 or 741.

49. The method of claim 47 or claim 48, wherein the leader peptide is also deleted. 

50. The method of any preceding claim, in which the heterologous expression is in an E.coli 
host. 

51. A protein expressed by the method of any preceding claim.

52. A heterologous protein comprising the N-terminal amino acid sequence MKKYLFSAA. 